


THE JOURNAL OF 
EDUCATIONAL PSYCHOLOGY 








Volume XVI May, 1925 Number 5 








FURTHER EXPERIMENTS IN THE APPLICATION OF 
SPEARMAN’S PROPHECY FORMULA 


KARL J. HOLZINGER AND BLYTHE CLAYTON 
University of Chicago 
INTRODUCTORY 


In the May, 1923, number of this Journal a report was made in 
which some experimental results were compared with theoretically 
expected ones determined by the use of Spearman’s Prophecy Formula, 


$$ r_{nn} = \frac{n\,r_{xx}}{1 + (n-1)\,r_{xx}} \qquad (1) $$


$r_{nn}$ being the reliability coefficient obtained by pooling n similar tests
of equal length and reliability, and $r_{xx}$ the coefficient for the component
tests. The material was furnished by Forms A and B of the Terman
Group Intelligence Scale, each form consisting of 10 parts which were
treated as component tests. The net result of this study was that the
above prophecy formula based on the average of the individual
reliability coefficients, $r_{xx}$, gave an over-prediction beyond the first four or
five tests pooled. For such material, then, Spearman’s Formula
should be used with great caution, especially when several unhomogeneous
tests are pooled.

The present study is concerned with the applicability of the formula 
to components which more fully meet the conditions of equal length 
and reliability. For this purpose two distinctly different types of 
material were used. In the first experiment equal time units of one 
minute were secured with the Otis Self-administering Test of Mental 
Ability. The results showed that the components were very unequal 
as to reliability and difficulty. When the first coefficient from the 
experimental series was substituted in equation (1) a significant
over-prediction resulted, but when a suitably chosen value for $r_{xx}$ was



employed, fairly good agreement between experimental and theoret- 
ical results was obtained. 

For the second experiment the most carefully graded, homogeneous 
material available was employed. This consisted of parallel cycles 
made up of spelling words from the Buckingham Extension of the 
Ayres’ Spelling Scale. The agreement in this case between observed 
and expected values was remarkably close when either the initial or 
“best” value for $r_{xx}$ was substituted in equation (1). It thus appears
that with suitable material Spearman’s Prophecy Formula may give 
excellent results. With unhomogeneous or poorly graduated material, 
however, the application of the law is uncertain. 

Both of the above experiments will next be discussed in some 
detail, and certain alternative formulas given for the purpose of 
simplifying the verification of the Spearman Law and dealing with its 
probable error. 


EXPERIMENT WITH EQUAL TIME UNITS


In seeking a test which could be split up into a number of equal
units, the Otis Self-administering Test was chosen because it had two
forms made up of apparently homogeneous material upon which a
student works continuously for 20 or 30 minutes. Form A was
accordingly given to a class of about 100 graduate students with
instructions to make a small mark under the item on which they were
working or which they had just completed as time was called out every
minute and a half. Form B was given under similar conditions on the
following day. This procedure made it possible to determine scores
for 10 equal time intervals on both forms. Owing to the fact that
some members of the class failed to follow directions carefully or else
finished the whole test too quickly to furnish 10 time intervals, the
number of cases was finally reduced to 75.


TABLE I.—MEANS AND STANDARD DEVIATIONS FOR SUCCESSIVE TIME UNITS ON
THE OTIS TEST

                    Form A              Form B
Time unit       Mean    S. D.       Mean    S. D.

    1           7.11    2.02        7.09    2.29
    2           6.40    2.09        6.99    2.17
    3           5.28    1.95        6.51    1.59
    4           4.77    2.52        5.21    1.39
    5           4.88    2.11        4.40    1.78
    6           4.43    1.71        4.53    1.88
    7           3.41    1.58        4.35    1.70
    8           3.67    1.71        4.24    1.50
    9           2.75    1.61        3.59    1.58
   10           2.61    1.38        3.27    1.44


Figure 1. Difficulty of individual items of Otis
Self-Administering Test of Mental Ability. (Vertical axis: per cent failing.)

The means and standard deviations for the successive time units 
are given in Table I. It is apparent that the first items in the test 
are easier than the rest and that there is a general increase in difficulty 
toward the end of the test. The variability of the successive units 
decreases but not with regularity. This unevenness in the material 
is brought out more clearly when the difficulty of each item is deter- 
mined by the percentage of pupils failing it. Figure 1 shows that it is 




quite unlikely that equally difficult items would be encountered by a 
student either in equal time intervals or in corresponding periods on 
the two forms. We should, therefore, expect irregularity in the cor- 
relations for the separate time units, and the coefficients in the second 
column of Table II do show great variation. 


TABLE II.—RELIABILITY COEFFICIENTS FOR SEPARATE TIME UNITS AND FOR
CUMULATIVE POOLS OF THEM, OTIS TEST

               Reliability of separate    Cumulative reliability
Time unit            time unit                 coefficients

    1                 + .495                      .495
    2                 + .469                      .579
    3                 + .216                      .604
    4                 + .002                      .649
    5                 + .162                      .739
    6                 - .163                      .749
    7                 - .044                      .752
    8                 + .404                      .776
    9                 + .143                      .796
   10                 - .090                      .828

Average ............. + .159











In spite of these low irregular values, however, the cumulative 
reliability coefficients show a regular increase up to .828. The 
cumulative coefficients were obtained by pairing the first time unit in 
Form A with that on Form B, giving .495; then pooling the first two 
time units on A and the corresponding two on Form B with the result- 
ing correlation .579, and so on. 

In Fig. 2 the cumulative coefficients have been plotted for com- 
parison with theoretically expected results. The latter were obtained 
by substituting in equation (1) the “best” value for $r_{xx}$ furnished by
a method described below. The broken and smooth curves show 
trends which are roughly similar, and the deviations of the experi- 
mental values from the theoretical ones lie fairly well within the zone 
which is one standard error above and below the smooth trend. 

For practical purposes, however, we do not have a “best” value
for $r_{xx}$, but must employ the first coefficient or an average of several.
With the present data the first reliability coefficient, $r_{11}$ = .495,
yields .907 when substituted in equation (1); the average, .159,
furnishes $r_{nn}$ = .654. Both of these results differ considerably from
the observed cumulative value .828. We thus find that although the
Spearman Law is probably behind the trend of the data when a “best”
value for $r_{xx}$ is employed, this agreement is not present when $r_{xx}$ is one
or the average of a set of irregular coefficients.


Figure 2. Experimental trend compared with best
approximation to Spearman’s prophecy law.
Otis Intelligence Test data. (Curves: best approximation; experimental;
deviations one S. D. from the theoretical approximation. Horizontal axis:
number of tests cumulated. Vertical axis: cumulative reliability coefficient.)
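
The two predictions quoted above can be checked directly from equation (1). The following is a minimal sketch only; the function name `spearman_prophecy` is an illustrative choice and does not come from the original paper.

```python
def spearman_prophecy(r_xx: float, n: int) -> float:
    """Equation (1): reliability predicted for n pooled tests of reliability r_xx."""
    return n * r_xx / (1 + (n - 1) * r_xx)


# Otis data, n = 10 time units (coefficients quoted in the text).
from_first = spearman_prophecy(0.495, 10)    # first coefficient
from_average = spearman_prophecy(0.159, 10)  # average of the ten coefficients
observed = 0.828                             # observed cumulative reliability

print(f"first: {from_first:.3f}  average: {from_average:.3f}  observed: {observed:.3f}")
```

The observed value falls between the two predictions, which is the over-prediction and under-prediction pattern the text describes.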


THE STANDARD ERROR OF SPEARMAN’S FORMULA


In order to determine whether or not the variations from Spear- 
man’s Law are greater than might be reasonably expected by chance, 
it is necessary to determine the standard error of this formula. An 




approximation which will often give good results has been proposed
by Mr. Shen.¹ His formula may be written

$$ \sigma_{r_{nn}} = \frac{n(1 - r_{xx}^2)}{\sqrt{N}\,[1 + (n-1)r_{xx}]^2} \qquad (2) $$

where N is the size of the population dealt with. In getting out this
result (which he gives without formal proof), Mr. Shen has evidently
neglected to use the full difference function which gives

$$ \sigma_{r_{nn}} = \frac{n(1 - r_{xx}^2)}{\sqrt{N[1 + (n-1)r_{xx}]^4 + (n-1)^2[1 + (n-1)r_{xx}]^2[1 - r_{xx}^2]^2}} \qquad (3) $$

This last result is readily obtained by setting up the difference function
$f(r_{xx} + \Delta r_{xx}) - f(r_{xx})$ (where $f(r_{xx}) = r_{nn}$), squaring, summing for all
samples, and making the substitution $[\Delta^2 r_{xx}] = \frac{(1 - r_{xx}^2)^2}{N}$. For low
values of $r_{xx}$ it is better to use formula (3), but for fairly high
correlations (2) will often suffice. Thus if N = 100, $r_{xx}$ = .1 and n = 11,
formula (2) gives .272 while formula (3) yields .244. The latter, of
course, is the more correct result and a difference of .03 can hardly be
neglected. For N = 100, $r_{xx}$ = .5 and n = 11 we get for $\sigma_{r_{nn}}$, .0229
and .0227 by the two formulae, the difference .0002 being negligible.
In this case equation (2) would suffice.

We are now in a position to test the variations from the Spearman 
Law with the Otis Test material. Substituting N = 75, rzz = .159 
and n = 10 in equations (2) and (3) we find .190 and .176 respectively. 
Using either of these values we note that the difference, .828 — .654 = 
.174, may quite readily be ascribed to chance fluctuation and not 
necessarily to under-prediction. Next substituting r., = .495 in 
equation (3) we find, .029. In this case the variation .907 — .828 = 
.079 is probably statistically significant and we have some evidence 
of over-prediction. 
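
The standard errors used in this section follow from formulas (2) and (3) as reconstructed from the worked figures in the text. The sketch below evaluates both; the function names are illustrative assumptions, not the authors' notation.

```python
import math


def se_formula_2(r_xx: float, n: int, N: int) -> float:
    """Formula (2), Shen's approximation to the standard error of r_nn."""
    return n * (1 - r_xx**2) / (math.sqrt(N) * (1 + (n - 1) * r_xx) ** 2)


def se_formula_3(r_xx: float, n: int, N: int) -> float:
    """Formula (3), using the full difference function."""
    k = 1 + (n - 1) * r_xx
    radicand = N * k**4 + (n - 1) ** 2 * k**2 * (1 - r_xx**2) ** 2
    return n * (1 - r_xx**2) / math.sqrt(radicand)


# Cases worked in the text: (r_xx, n, N).
for r_xx, n, N in [(0.1, 11, 100), (0.5, 11, 100), (0.159, 10, 75), (0.495, 10, 75)]:
    print(r_xx, n, N,
          f"(2): {se_formula_2(r_xx, n, N):.4f}",
          f"(3): {se_formula_3(r_xx, n, N):.4f}")
```

Formula (3) is always the smaller of the two, since its radicand adds a positive term to the square of the denominator of (2).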

The results thus far indicate that agreement with the Spearman 
Formula is far from close with material which upon casual inspection 
would appear to be fairly well graduated and homogeneous. Extreme 
caution in the application of the formula to such test material would, 
therefore, seem to be very necessary. 

Before giving an account of the last experiment some alternative 
methods for testing and interpreting the Spearman Law will be 
presented. 





1 Shen, Eugene: Standard Errors of Certain Correlation Coefficients. Journal 
of Educational Psychology, October, 1924. 




SOME ALTERNATIVE METHODS


Formula (2) may be written in another form by substituting the
expression for $r_{xx}$ from equation (1). By easy reduction we may write

$$ \sigma_{r_{nn}} = \frac{[n - (n-1)r_{nn}]^2 - r_{nn}^2}{n\sqrt{N}} \qquad (4) $$

For N = 100, $r_{nn}$ = .55, and n = 11, this gives .272 as before. There
is little to choose between formulas (2) and (4) but the latter may at
times be more convenient and one form serves as a useful check on the
other.
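
The agreement of formulas (2) and (4) can be confirmed numerically. This is a sketch under the reconstruction of (4) given above (it follows from solving equation (1) for the component reliability); the function names are illustrative.

```python
import math


def prophecy(r_xx: float, n: int) -> float:
    """Equation (1)."""
    return n * r_xx / (1 + (n - 1) * r_xx)


def se_formula_2(r_xx: float, n: int, N: int) -> float:
    """Formula (2), written in terms of the component reliability r_xx."""
    return n * (1 - r_xx**2) / (math.sqrt(N) * (1 + (n - 1) * r_xx) ** 2)


def se_formula_4(r_nn: float, n: int, N: int) -> float:
    """Formula (4), written in terms of the pooled reliability r_nn."""
    return ((n - (n - 1) * r_nn) ** 2 - r_nn**2) / (n * math.sqrt(N))


r_xx, n, N = 0.1, 11, 100
r_nn = prophecy(r_xx, n)  # .55, the pooled value used in the text
print(f"r_nn = {r_nn:.2f}")
print(f"(2): {se_formula_2(r_xx, n, N):.3f}   (4): {se_formula_4(r_nn, n, N):.3f}")
```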

We next turn to a simpler way of expressing Spearman’s Law.
This may be accomplished by a transformation of equation (1). If
we set $a = \frac{1 - r_{xx}}{r_{xx}}$ there results at once

$$ Z = \frac{n}{r_{nn}} = a + n \qquad (5) $$

This last expression is at once recognized as an hyperbolic curve in
$r_{nn}$ and n or a straight line with a slope of +1.00 in the Z-n plane.
The Spearman Law in the linear form is very simple and will be found
convenient for testing experimental results of the kind here discussed.
The data are given as n and $r_{nn}$. It is thus only necessary to obtain
the quotients $Z = \frac{n}{r_{nn}}$ and plot them against n to see if a straight line
results. The value of a is best taken as an average of the differences
Z - n (method of least squares). To illustrate the procedure, the
Otis material may be arranged as follows:

   n        r_nn        Z        Z - n

   1        .495       2.02      1.02
   2        .579       3.45      1.45
   3        .604       4.97      1.97
   4        .649       6.16      2.16
   5        .739       6.77      1.77
   6        .749       8.03      2.03
   7        .752       9.31      2.31
   8        .776      10.31      2.31
   9        .796      11.31      2.31
  10        .828      12.08      2.08

Average .............................. 1.94

$a = 1.94 = \frac{1 - r_{xx}}{r_{xx}}$. Therefore $r_{xx}$ = .34.


The required line has the equation Z = 1.94 + n and is shown in 
Fig. 3. Considering the number of cases and the material, the fit 




appears to be a fairly good one. It must be noted, however, that the
values for a and $r_{xx}$ have been chosen so as to give the best fit possible.
The initial coefficient, .50, or the average value, .16, would give much
poorer results.
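
The linear procedure can be sketched in a few lines. With the slope fixed at 1, the least-squares intercept is simply the mean of the differences Z - n, which is what the code computes. Variable names are illustrative; because the quotients are recomputed from the three-figure coefficients, an individual Z may differ from the printed table by a unit or two in the second decimal.

```python
# Cumulative reliability coefficients for the Otis data (Table II).
cumulative = [0.495, 0.579, 0.604, 0.649, 0.739, 0.749, 0.752, 0.776, 0.796, 0.828]

# Z = n / r_nn for each number of pooled units n = 1..10.
z_values = [n / r_nn for n, r_nn in enumerate(cumulative, start=1)]

# With slope fixed at +1, the least-squares intercept a is the mean of Z - n.
differences = [z - n for n, z in enumerate(z_values, start=1)]
a = sum(differences) / len(differences)

# a = (1 - r_xx) / r_xx, hence r_xx = 1 / (1 + a).
r_best = 1 / (1 + a)

print(f"a = {a:.2f}, best r_xx = {r_best:.2f}")
```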


Figure 3. Graphical test of Spearman’s
prophecy law in the linear form. (Horizontal axis: number of tests cumulated.)


In order to test the variations of Z, its standard error is required.
This may be obtained by noting that only $a = \frac{1 - r_{xx}}{r_{xx}}$ need be taken
into account in setting up the difference function. Thus we have

$$ \Delta Z = \frac{-\Delta r_{xx}}{r_{xx}(r_{xx} + \Delta r_{xx})} $$

Squaring, summing for all samples, dividing by the number of samples,
and substituting $\sigma_{r_{xx}} = \frac{1 - r_{xx}^2}{\sqrt{N}}$, we get

$$ \sigma_Z = \frac{1 - r_{xx}^2}{\sqrt{r_{xx}^4 N + r_{xx}^2 (1 - r_{xx}^2)^2}} \qquad (6) $$

For fairly large values of $r_{xx}$, the approximation

$$ \sigma_Z = \frac{1 - r_{xx}^2}{r_{xx}^2 \sqrt{N}} \qquad (7) $$

will often suffice. These last two formulas will be found more
convenient to apply than (2) and (3) inasmuch as n does not appear and
one calculation will be enough for the whole set of points. Substituting
N = 75 and $r_{xx}$ = .34 in equations (6) and (7) gives $\sigma_Z$ = .846
and .883 respectively. The greatest observed difference for Z is 0.92,
a variation easily within the range of chance fluctuations.
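
Formulas (6) and (7), as reconstructed above, can be evaluated for the Otis figures in a short sketch. The function names are illustrative and not part of the original.

```python
import math


def se_z_full(r_xx: float, N: int) -> float:
    """Formula (6): standard error of Z from the full difference function."""
    return (1 - r_xx**2) / math.sqrt(r_xx**4 * N + r_xx**2 * (1 - r_xx**2) ** 2)


def se_z_approx(r_xx: float, N: int) -> float:
    """Formula (7): approximation for fairly large r_xx."""
    return (1 - r_xx**2) / (r_xx**2 * math.sqrt(N))


# Otis data: N = 75 cases, "best" r_xx = .34.
print(f"(6): {se_z_full(0.34, 75):.3f}   (7): {se_z_approx(0.34, 75):.3f}")
```

Neither function takes n as an argument, which is the convenience the text points out: one value serves for every point on the line.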

The linear method is thus seen to give a simple means of determining
a “best” value for $r_{xx}$ and of verifying the Spearman Law. These
last results of course parallel those obtained with the hyperbolic form
of the law. The curve plotted in Fig. 2 has the equation

$$ r_{nn} = \frac{.34n}{.34n + .66} $$





EXPERIMENT WITH SPELLING MATERIAL 


Spelling words from the Buckingham Extension of the Ayres 
Scale were next chosen for experiment because they appeared to be the 
most accurately calibrated test material suitable for our purposes. 
Two tests of 105 words each were arranged with seven cycles in each 
test. Each cycle was made up of 15 words, five from column U, 
three from column V, two from each of columns W and X, and one 
from each of columns Y, Z, and AA. The corresponding words in the 
two forms were thus equally difficult and the cyclic arrangement made 
possible seven components each of increasing difficulty within itself 
but equal in total value to all of the others. 

The tests were given on consecutive days to 125 first-year high 
school pupils. Very fortunately the words had been so chosen that 
there was no bunching of the scores at either end of the scale and the 
correlations were, therefore, not affected by the familiar truncated 
distributions so often found with test material. The statistical 
results are given in Table III.





TABLE III.—RELIABILITY COEFFICIENTS FOR THE INDIVIDUAL CYCLES AND FOR
CUMULATIVE POOLS OF THEM

Number of        Reliability coefficient    Cumulative reliability    Theoretical coefficients
cycles pooled    for each cycle             upon pooling              with r_xx = .743

     1                .743                       .743                    .743 ± .040
     2                .737                       .841                    .853 ± .026
     3                .788                       .906                    .897 ± .019
     4                .747                       .916                    .920 ± .015
     5                .816                       .941                    .936 ± .013
     6                .778                       .949                    .945 ± .011
     7                .801                       .955                    .953 ± .009


Figure 4. Experimental trend compared
with theoretical values based upon first
reliability coefficient. Spelling test data. (Curves: theoretical; experimental;
deviations one S. D. from the theoretical. Horizontal axis: number of tests
cumulated. Vertical axis: cumulative reliability coefficient.)


The theoretical values are obtained from equation (1) using .743
as the value for $r_{xx}$. The errors following each value are standard
errors from formula (2). The above results are also given in Fig. 4
from which it is at once evident that in no case is the variation from
the theoretically expected result as much as one standard error. The
fit is, therefore, an excellent one, and the experiment shows that with
such well-graded material the Spearman Law works out with entire
satisfaction.

It will be interesting to determine how nearly the first coefficient,
.743, approaches the “best” value obtained by the average of the
seven (Z - n) differences. The result for a works out at .338 and $r_{xx}$ =
.747, the last result differing from .743 by only .004. Such close
agreement is probably accidental, but either initial or “best” value
for $r_{xx}$ gives excellent results in this problem.
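
Both checks described for the spelling data can be sketched together: the comparison of each observed cumulative coefficient with its theoretical value and standard error, and the "best" value from the linear form. Names are illustrative; recomputed theoretical values may differ from the printed table by a unit in the third decimal.

```python
import math

N = 125          # pupils tested
R_FIRST = 0.743  # reliability of the first cycle
observed = [0.743, 0.841, 0.906, 0.916, 0.941, 0.949, 0.955]  # cumulative values


def prophecy(r_xx: float, n: int) -> float:
    """Equation (1)."""
    return n * r_xx / (1 + (n - 1) * r_xx)


def se_formula_2(r_xx: float, n: int, N: int) -> float:
    """Formula (2)."""
    return n * (1 - r_xx**2) / (math.sqrt(N) * (1 + (n - 1) * r_xx) ** 2)


# True where the observed value lies within one standard error of theory.
within_one_se = []
for n, r_obs in enumerate(observed, start=1):
    theory = prophecy(R_FIRST, n)
    se = se_formula_2(R_FIRST, n, N)
    within_one_se.append(abs(r_obs - theory) < se)
    print(n, f"{r_obs:.3f}", f"{theory:.3f}", f"{se:.3f}")

# "Best" value from the linear form: a is the mean of Z - n, with Z = n / r_nn.
a = sum(n / r - n for n, r in enumerate(observed, start=1)) / len(observed)
r_best = 1 / (1 + a)
print(f"a = {a:.3f}, best r_xx = {r_best:.3f}")
```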


SUMMARY 


1. The Spearman Law may fail to give a satisfactory prediction 
with unhomogeneous, unevenly graduated test material when the
formula is based on the first observed coefficient $r_{xx}$ or the average of
several. 

2. With accurately calibrated test material the above law has 
been shown to give an excellent basis for prediction. 

3. Certain new methods and formulas have been introduced by 
means of which the checking of experimental results against the 
Spearman Formula is facilitated. 

4. By using “best” values for $r_{xx}$ instead of initial or average
experimental results, a fairly good agreement between experimental 
and theoretical results is obtained. This indicates that the Spearman 
Law is behind the trend of the data even though for purposes of pre- 
diction it cannot be directly applied in such cases. 





THE APPLICABILITY OF THE SPEARMAN-BROWN 
FORMULA FOR THE MEASUREMENT OF 
RELIABILITY 


TRUMAN L. KELLEY 


Stanford University 


Dr. Kate Gordon reports an interesting study on “Group Judg- 
ments in the Field of Lifted Weights” in the October, 1924, number 
of the Journal of Experimental Psychology. The purpose of the study 
is stated thus: “The issue may be phrased as follows: Is the judgment
of a group of persons any better than the judgment of the average
member of the group? This paper reports an experiment devised
to answer the question in the field of lifted weights.” Dr. Gordon 
found that the average correlation of the arrangements of the weights 
by single judges with the true order of the weights was .41; that the 
average order given by averaging the results of five judges correlated 
with the true order to the extent of .68; for 10 judges to the extent of 
.79; for 20 judges to the extent of .86; and for 50 judges to the extent 
of .94. 

The increase in reliability with increase in number of judges is
supposed to be given by the Spearman-Brown Formula.¹

$$ r_{aA} = \frac{a\,r_{1I}}{1 + (a-1)\,r_{1I}} \qquad (1) $$





We have here in the field of the judgments of weight an opportunity 


to test this formula. Dr. Gordon has calculated rho coefficients of
correlation $\left[\text{rho} = 1 - \frac{6\Sigma D^2}{N(N^2 - 1)}\right]$ and has recorded averages to
two decimal places. Recalculating the average correlations to three
figures and finding for each rho the equivalent r with the usual formula:

$$ r = 2 \sin\left(\frac{\pi}{6}\,\text{rho}\right) $$

we have the first two rows of the following table:





1 Kelley, T. L.: “Statistical Method,” p. 205. 
TABLE I

                                              One      Five     Ten      Twenty   Fifty
                                              judge    judges   judges   judges   judges

Correlation with true order of weights
  Dr. Gordon’s average rho values ........    .405     .685     .790     .865     .945
  Equivalent r values ....................    .421     .702     .803     .875     .949

Reliability coefficients .................    r_{1I} = r_{5V} = r_{10X} = r_{20XX} = r_{50L} =
                                              .177     .493     .645     .766     .901

Spearman-Brown reliability starting
  with r_{1I} ............................             .518     .683     .812     .915
Spearman-Brown reliability starting
  with r_{5V} ............................                      .660     .795     .907
Spearman-Brown reliability starting
  with r_{10X} ...........................                               .784     .901
Spearman-Brown reliability starting
  with r_{20XX} ..........................                                        .891




















Pearson product-moment r values, which are equivalent to Spear- 
man rho values, have been found because product moment coefficients 
of correlation are presupposed in the derivation of Formula 1. 
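
The conversion from rho to the equivalent product-moment r can be sketched as follows. The function name is illustrative. Since the three-figure rho averages are themselves rounded, a recomputed r may differ from the tabled value by a unit in the third decimal in some columns.

```python
import math


def rho_to_r(rho: float) -> float:
    """Equivalent product-moment r for a rank coefficient: r = 2 sin(pi * rho / 6)."""
    return 2 * math.sin(math.pi * rho / 6)


# Average rho values for 1, 5, 10, 20 and 50 judges.
for rho in [0.405, 0.685, 0.790, 0.865, 0.945]:
    print(f"rho = {rho:.3f}  ->  r = {rho_to_r(rho):.3f}")
```

The conversion leaves 0 and 1 unchanged and raises intermediate values slightly, which is why the second row of the table runs a little above the first.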

The correlation between a judge’s ranking and the true ranking is in
fact the correlation between a fallible score of a single judge and a
true score, and may therefore be designated $r_{1\infty}$, in which the subscript
1 represents the ranking of a single judge and the subscript ∞ the true
ranking. Following the same general type of reasoning as leads to
Formula 1, one obtains Formula 2:¹

$$ r_{1\infty} = \sqrt{r_{1I}} \qquad (2) $$

or

$$ r_{1I} = r_{1\infty}^2 $$


Thus, if we square .421 we obtain .177, which is the average reliability 
coefficient of the rankings of the single judges. Similarly, squaring 
.702 yields .493, which is the reliability coefficient of the average 
rankings of five judges, etc. These are the experimental values yielded 


1 Ibid., p. 206. 







by Dr. Gordon’s figures, with which we may now compare the results
given by Formula 1. Having $r_{1I}$ equal to .177 we write:

$$ r_{5V} = \frac{5(.177)}{1 + 4(.177)} = .518 $$

$$ r_{10X} = \frac{10(.177)}{1 + 9(.177)} = .683 $$

etc. The value .518 may be compared with the experimental value
.493. The discrepancy .025 is small and well within the error limits
expected by the size of population dealt with. Accordingly, the agree- 
ment between these two values is a very good indication that the 
Spearman-Brown Formula for determining reliability applies to 
such data as we have in hand. Nine other comparisons can be made. 
Retabulating them for ease of comparison we have: 

















TABLE II

                    Via Spearman-Brown      Via Dr. Gordon’s experimentally
                         formula                determined coefficients

r_{5V} ..........         .518                          .493
r_{10X} .........         .683                          .645
                          .660                          .645
r_{20XX} ........         .812                          .766
                          .795                          .766
                          .784                          .766
r_{50L} .........         .915                          .901
                          .907                          .901
                          .901                          .901
                          .891                          .901
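
The ten comparisons can be regenerated from the tabled reliability coefficients. This is a sketch only: the names are illustrative, and the multiplier a is taken as the ratio of the two group sizes, so that it need not be a whole number (20 judges to 50 gives a = 2.5). Because the starting coefficients are rounded to three figures, a recomputed value can differ from the table by a unit in the third decimal.

```python
def spearman_brown(r: float, a: float) -> float:
    """Formula 1: reliability when the number of judges is multiplied by a."""
    return a * r / (1 + (a - 1) * r)


judges = [1, 5, 10, 20, 50]
reliability = [0.177, 0.493, 0.645, 0.766, 0.901]  # experimental values

# Predict every larger group from every smaller one.
predictions = {}
for i, (j_from, r_from) in enumerate(zip(judges, reliability)):
    for j_to, r_obs in zip(judges[i + 1:], reliability[i + 1:]):
        predicted = spearman_brown(r_from, j_to / j_from)
        predictions[(j_from, j_to)] = predicted
        print(f"{j_from:>2} -> {j_to:>2} judges: predicted {predicted:.3f}, observed {r_obs:.3f}")
```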











All told, these values are in very close agreement. It is difficult
to calculate the probable error of the differences between them, but
as the rho value .405 is the average of 200 separate values we can
calculate its probable error by the formula for the probable error of the
mean. We find PE = .016. The greatest discrepancy in Table II
is that between values .812 and .766. It may easily be shown by
Formula 1 that $r_{1I}$ would need to equal .141 instead of .177 in order
to have $r_{20XX}$ equal to .766. The difference .036 is a trifle over two
probable errors. Thus this most extreme difference may still be easily
considered to be just a chance error.
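
The value .141 comes from solving Formula 1 for the single-judge reliability. A minimal sketch of that inversion follows; the function name is an illustrative assumption.

```python
def required_single_reliability(r_pooled: float, a: float) -> float:
    """Formula 1 solved for r_1I: r_1I = r_aA / (a - (a - 1) * r_aA)."""
    return r_pooled / (a - (a - 1) * r_pooled)


needed = required_single_reliability(0.766, 20)  # to reproduce the 20-judge value
gap = 0.177 - needed                             # shortfall from the observed .177
probable_error = 0.016

print(f"needed r_1I = {needed:.3f}, difference = {gap:.3f}, "
      f"ratio to PE = {gap / probable_error:.2f}")
```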

Dr. Gordon concludes by saying that: “In other words, the results
of the group are distinctly superior to the results of the average
member, and are equal to those of the best members.” We may
state this more explicitly and say that the increase in reliability is 
that forecast by the Spearman-Brown Formula. 














FORMULAS FOR SCORING TESTS IN WHICH THE
MAXIMUM AMOUNT OF CHANCE IS 
DETERMINED 


GEORGE FREDERICK MILLER 


University of Oklahoma 


The element of chance is involved in all kinds of mental tests. 
When a child is asked which is the longer of two lines, or which is his 
left ear, he might by chance point to the correct one. If in spelling 
he does not know whether it is m-a-i-n or m-a-i-n-e, he might get 
it right or wrong by chance. 

In one type of test the amount of guessing that the examinee does 
is not revealed by either the results or the form of the test. If the 
question is, What is the capital of Pennsylvania?, the examinee might 
know it is 1 of 2 names, or he might know it is 1 of 10, 48, or any other 
number. His finished paper does not show to what extent chance 
was a factor in his answer. In this type of question, where the maxi- 
mum amount of guessing is unknown, the only practical scoring is on 
the basis of the number of correct responses. 

In another type of test the maximum number of chances is deter- 
mined. In the true-false test, for instance, the guessing is limited 
to 2 items in each question, from which 1 may be chosen. In Test 3 
of the Army Alpha Examination, the examinee may choose 1 of 3 
items in each question, and in Tests 7 and 8 of that examination a 
choice of 1 out of 4 words, only 1 of which is correct, may be made. 
In this type the exact amount of guessing that the examinee does is 
not revealed objectively. If his paper is perfect, he might have 
guessed at none, or he might have guessed successfully at several. 
But his guesses are limited to a certain maximum number, so that 
they can be considered systematically in arriving at the score. 

In the true-false test the common way of marking is to subtract
the wrong from the right answers, not counting those questions not
marked. This method is expressed by the formula $S = (S_1 - U) - 2W$,
in which S means the score that the examinee makes, $S_1$ means
the maximum score for the test, U means the number of unmarked
questions, and W means the number of wrong answers.

When each question contains 3 items from which 1 choice may be
made, and only 1 item is the correct answer, what is the best method
of scoring? The same principle should be applied in this case that is
commonly used for the true-false: score so that the paper that has only
the number right that would probably be obtained by chance will be
marked 0, and papers that have a higher number of correct responses
will be scored by the same formula, and will be marked proportionately
higher. Since in the case of 3 possibilities, in which only 1
correct choice can be made, the examinee can guess about 1/3 of them
right, the paper with only 1/3 right should be marked 0. The formula
for this case is: $S = (S_1 - U) - \frac{3}{2}W$, in which the letters have the
same meanings as above. If 30 questions are marked in such a test,
and only 10 of them are correct, the score is 0; if 25 of the 30 are
correct, the score is 22.5. If the test is of a similar kind, but provides 1
choice out of 4, the fraction in the formula is 4/3 instead of 3/2. If 1
choice out of 5 is taken, the fraction is 5/4, and if 1 out of 6, it is 6/5.
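
The scoring rule for questions with one correct answer among several options can be sketched as below. The function and parameter names are illustrative; exact fractions are used so that the chance-level paper scores exactly 0.

```python
from fractions import Fraction


def corrected_score(max_score: int, unmarked: int, wrong: int, options: int) -> Fraction:
    """S = (S_1 - U) - options/(options - 1) * W, for one correct answer per question."""
    return (max_score - unmarked) - Fraction(options, options - 1) * wrong


# Three-option test, 30 questions marked.
print(corrected_score(30, 0, 20, 3))         # 10 right, 20 wrong: chance level
print(float(corrected_score(30, 0, 5, 3)))   # 25 right, 5 wrong

# Multipliers of W for 2, 3, 4, 5 and 6 options.
print([str(Fraction(k, k - 1)) for k in range(2, 7)])
```

With two options the multiplier is 2, which reproduces the usual right-minus-wrong marking of the true-false test.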

In order to make the formula more general other cases need to be 
considered. Instead of only one choice being correct, as is assumed 
in all of the above cases, the test may be so constructed that two or 
more choices will be correct; but the same principle stated above 
holds—when the number of correct responses is not greater than a 
chance marking would approximate, the score should be 0; and when 
the number of correct responses is greater than chance would approxi- 
mate, a correct score is obtained by using the same formula that gives 
0 when the number of correct responses is not greater than chance 
usually approximates. In cases where more than 1 choice is correct, 
the number correct must, of course, always be at least 1 less than the 
number of items from which the choices are made. Furthermore, if 
the choices permitted are greater than 1 out of 2, the value of the test 
is needlessly weakened by increasing the element of chance. For
example, if 3 items are provided, 2 of which are correct and will count 
on the score if chosen, and 2 choices are permitted, then 1 of the 
choices will inevitably be correct. If the score in that test is 
found by subtracting the wrong answers from the right, the score will 
have to be either 0 or 2. Another illustration is the test in which 13 
items from which to choose are given, 9 choices are permitted, and 9 
correct answers are possible. It will be noticed that only 4 of the 9 
choices can be incorrect. If the examinee makes all of the mistakes 
that he possibly can, and his wrong answers are subtracted from his 
right, he will still have 1 point to his credit. It might be assumed that 
no tests would ever be constructed allowing a greater proportion of 
choices than 1 out of 2, and that it is folly to mention such possibili- 
ties. But the two examples just mentioned are taken from a printed 




standard test.¹ The cases mentioned below that contain choices in
greater proportions than 1 out of 2, are mentioned only in deriving a 
general formula, and not to endorse their use. 

Applying the above principle to the case of 2 responses out of 3
being correct, 2/3 of the questions marked correctly would be the
approximate result of chance, and should be scored 0. The fraction
for 3 out of 5 would be 3/5; for 3 out of 6, 3/6, etc. The formulas for
these cases taken in the order above, and letting U equal 0, are:

$$ S = S_1 - 3W; \qquad S = S_1 - \tfrac{5}{2}W; \qquad S = S_1 - 2W; \quad \text{etc.} $$

To illustrate, suppose that a test contains 30 series of 5 items each, 3
of which, if marked correctly, count on the score, that 36 are wrong,
and that U equals 0. Then substituting in the formula gives:

$$ S = 90 - \tfrac{5}{2} \times 36, \text{ or } S = 0. $$


The principle of correcting scores for chance has just been applied to 
two types of series. The first was that 1 choice was allowed from any 
number of items, only 1 of which could count on the score. The 
second was that an indefinite number of choices was allowed from an 
indefinite number of items (the latter always being at least 1 greater 
than the former) and that the maximum number of items that could 
count on the score was equal to the number of choices. Two more 
types need to be considered. 

Suppose, to illustrate type 3, that the series is 3 chosen from a
group of 5, and not more than 2 of the 5 items can count on the score.
The values for these 3 quantities, when the series is divided by 3, are:
1, 5/3 and 2/3, respectively. In this form the first 2 quantities are
similar in form to those in the first type explained above, because the
first is unity and the second comes within the term “any number of
items.” The third quantity, the maximum number in the series that
can count on the score, has no effect on the fractional part of $S_1$ that
chance marking will approximate. It affects only the value of $S_1$,
and does not enter into the product subtracted from $S_1 - U$ in order
to find the value of S. Then when the form 3 out of 5 and 2 right is
reduced to the terms 1 out of 5/3 and 2/3 right, the amount of chance
involved in it can be calculated by the same process as the first type
referred to before, which was of the form 1 out of 5 and 1 right. Now
in the case of 1 chosen from 5, $\frac{5}{4} \times W$ or $\frac{5W}{4}$ is the product
subtracted from $S_1 - U$. Applying the same principle to the case of 1
out of 5/3, the 5/3 must be substituted for 5, which gives $\frac{5/3}{2/3} \times W$.
Placing this term in the formula gives:

$$ S = (S_1 - U) - \frac{\frac{5}{3} \times W}{\frac{2}{3}} $$

1 Frasier, George Willard and Armentrout, Winfield D.: “Standard Achievement
Test on an Introduction to Education,” Test IV.


The following problem will serve to illustrate this formula: A test 
contains 15 series of 5 items each; 3 items in each series are to be 
marked, and a maximum of 2 of the 5 items are correct answers and 
count on the score if correctly marked. If W equals 12, and U equals 
0, what is the score? Substituting in the formula gives: 





$$ S = 30 - \frac{\frac{5}{3} \times 12}{\frac{2}{3}}, \text{ or } S = 0. $$


A fourth type of series is possible. Suppose that the series is 2 
chosen from a group of 5, and correct answers can be made from 3 of 
the 5 items. It can easily be shown that the element of chance is the
same in this type as in the preceding one. About 3/5 of the maximum
score would be obtained by chance. The maximum score for the series, 
2, would also remain the same. So type 4 is solved by the same proc- 
ess as type 3. 
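
The claim that types 3 and 4 carry the same element of chance can be checked with a short sketch. It assumes, as an idealization of blind marking, that each mark falls on a scoring item with probability (scoring items)/(total items); the function name `chance_fraction` and its parameters are illustrative and do not come from the article.

```python
from fractions import Fraction

def chance_fraction(items, choices, correct, max_count):
    """Fraction of the maximum series score expected from blind marking.

    Assumption: each of the `choices` marks lands on one of the `correct`
    items with probability correct/items.
    """
    expected_right = Fraction(choices * correct, items)
    return expected_right / max_count

# Type 3: 3 chosen from 5, at most 2 items count on the score.
type_3 = chance_fraction(items=5, choices=3, correct=2, max_count=2)
# Type 4: 2 chosen from 5, correct answers can be made from 3 items.
type_4 = chance_fraction(items=5, choices=2, correct=3, max_count=2)

# n is the reciprocal of the chance fraction.
n = 1 / type_3

print(type_3, type_4, n)
```

Both types give the same fraction, and its reciprocal is the quantity 5/3 used for type 3.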

In order to arrive at a general formula for all of the above types, 
it is necessary to use a symbol for the quotient resulting from the total 
number of items in the series divided by the number of choices, or in 
case the fourth type of series is not changed into the third type before 
solving, this quantity will be the reciprocal of the fraction expressing 
the part of the score, S_1, that chance marking will approximate. Let
n represent that quantity. Then the general formula becomes:

$$S = (S_1 - U) - \frac{nW}{n - 1} \tag{1}$$

The question that may arise is, What is the use of scoring by the 
principles of Formula 1? This question is partly answered by an 
application of the formula to scoring some standard tests. 

1. In the Army Alpha Examination where 1 choice out of 4 is pro- 
vided as in Test 8, and only 1 of the 4 is correct, suppose that 4 ques- 
tions are unmarked, and that 27 of the remaining 36 are wrong, what 
should the score be? Substituting in the formula we have: 

$$S = (40 - 4) - \frac{4 \times 27}{4 - 1}, \quad\text{or } S = 0.$$


1 The formula may be simplified by letting n be the reciprocal of the fractional
part of the score, S_1, that will be wrong by chance marking. The formula then
becomes: S = (S_1 - U) - nW.













The authorized scoring in this case gives a score of 9 points, which 
means that the examinee probably gets 9 points more than he deserves. 
2. The next illustration is taken from Test 3 of a “Standard 
Achievement Test on an Introduction to Education,’”’ which was men- 
tioned above. In this test 8 items in column A are to be paired with 
8 in column B. Each item can be paired in only 1 way. If only 5 
correct groupings are made, and U is 0, the formula gives: 
$$S = 8 - \frac{8 \times 3}{8 - 1}, \quad\text{or } S = 4\tfrac{4}{7}.$$
The instructions given by the authors of that examination for scoring
this test (R - W) would give a score of 2 when only 5 correct answers
are made and no question is left unmarked.
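
The substitutions in Formula 1 can be verified with a minimal sketch, assuming the formula reads S = (S_1 - U) - nW/(n - 1) as the worked examples imply. The function name `formula_1` is illustrative only; exact fractions are used so that values such as 4 4/7 are not rounded.

```python
from fractions import Fraction

def formula_1(s1, u, w, n):
    """Score corrected for chance: S = (S_1 - U) - n*W / (n - 1)."""
    n = Fraction(n)
    return (s1 - u) - n * w / (n - 1)

# Army Alpha, Test 8: 40 questions, 1 choice out of 4, 4 unmarked, 27 wrong.
army_alpha = formula_1(s1=40, u=4, w=27, n=4)

# Matching test: 8 items paired with 8, 5 right and therefore 3 wrong.
matching = formula_1(s1=8, u=0, w=3, n=8)

# Type 3 problem: 15 series, 3 marked out of 5, 2 count; n = 5/3, W = 12.
type_3 = formula_1(s1=30, u=0, w=12, n=Fraction(5, 3))

print(army_alpha, matching, type_3)
```

With n = 2 the same function reduces to right minus wrong, the usual true-false rule.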
A further reason for correcting scores for probable chance marking 
is shown by the data that follow: 
TABLE I. AN ILLUSTRATION OF THE DIFFERENCE IN THE USE OF THE FORMULA
AND THE USUAL METHOD OF CALCULATING SCORES, IN THE CASE OF TESTS
IN WHICH THE MAXIMUM AMOUNT OF CHANCE IS KNOWN

Scores under each examination: (a) by chance; (b) by actual trial;
(c) by formula and actual trial.

Number      Army Alpha Group     Otis Group Intelligence     Haggerty Intelligence
of test     Examination          Scale, Form A               Examination, Delta 2
            (a)   (b)   (c)      (a)   (b)   (c)             (a)   (b)   (c)
2            -     -     -        5     6     1               -     -     -
3            5     5     0        -     -     -               -     -     -
5            -     -     -        -     -     -               5     6     1
6            -     -     -        -     -     -              10    11     1
7           10     7     0        5     5     0               -     -     -
8           10    15     5        4     1     0               -     -     -
9            -     -     -        8     7     0               -     -     -
10           -     -     -       10     9     0               -     -     -
Totals      25    27     5       32    28     1              15    17     2
































Explanations: 1. “Chance” here means the score that would probably be made
by chance. In the Army Alpha Examination, Test 3, there are 16 questions, and
the choice is 1 out of 3 in each question. So the probable score by chance is 5 1/3.

2. “Actual trial” means that a chance marking was actually made for the
tests. Cards were used. In the case of Test 3 just mentioned, 16 cards were 
numbered 1, 16 were numbered 2, and 16 were numbered 3. They were shuffled, 
and 16 of them drawn. The items were marked in the order in which the cards 
were drawn. 

3. The score made by the actual trial was used in the formula to correct it for 
chance. If the chance scores were used in the formula they, of course, would all 
give 0. 
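
The remark in Explanation 3, that chance scores put through the formula all give 0, can be illustrated for Army Alpha Test 3. This is a sketch under the assumption that Formula 1 reads S = (S_1 - U) - nW/(n - 1); the helper names are illustrative and not the author's.

```python
from fractions import Fraction

def chance_score(questions, items_per_question):
    """Expected number right when one item per question is marked blindly."""
    return Fraction(questions, items_per_question)

def formula_1(s1, u, w, n):
    """Score corrected for chance: S = (S_1 - U) - n*W / (n - 1)."""
    n = Fraction(n)
    return (s1 - u) - n * w / (n - 1)

# Army Alpha, Test 3: 16 questions, 1 choice out of 3.
right = chance_score(16, 3)   # 16/3, that is 5 1/3
wrong = 16 - right            # every question marked, so the rest are wrong
corrected = formula_1(s1=16, u=0, w=wrong, n=3)

print(right, wrong, corrected)
```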




It is very significant to notice some of the results obtained from some 
intelligence tests when no compensation for chance is made. The 
Army Alpha Examination, for example, gives a score of 25 by pure 
chance (Table I). In practice the grade of C— was given for scores 
from 25 to 44 on that examination, and the grades of D and D— for 
lower than 25. Now it is evident that score 25 if made only on Tests 
3, 7, and 8 is the probable 0 mark, and that the lowest C— individual, 
as well as all below him, were of such low grade intelligence that they 
could not be measured by the test and the method of scoring. Scores 
of 25 and lower may be due entirely to chance, and have no significance, 
unless they are made on tests other than 3, 7 and 8; or unless credit 
for intelligence is given for merely holding a pencil and marking in 
certain places at random in Tests 3, 7, and 8. 

The Army Examination was the forerunner of numerous intelli- 
gence examinations of a similar kind, which followed the same method 
of scoring—right answers minus wrong in tests where the choice of 1 
out of 2 was given, and counting on the score all those right in all 
other tests. 

Another instance from Table I may be taken from the Otis Group 
Intelligence Scale, in which chance marking gives a score of 32. This 
mark of 32 means, according to the norms worked out for the test, a 
Binet Mental Age of 8.6 years. If compensation is made for chance, 
the score of 32, if made entirely on Tests 2, 7, 8, 9 and 10, becomes 0. 
It would be absurd to suppose that all children who make 32 or lower 
on the test have 0 intelligence. They might make some of the score 
on tests other than 2, 7, 8, 9 and 10, or have intelligence that the 
examination does not measure. Because a 15-pound child weighs 0 
on scales used for weighing freight cars and graduated in no smaller 
units than 100 pounds, does not prove that the child has no weight. 

In case only the relative standing of a group of individuals is 
desired, and a single test is used, the whole number of right answers 
used as the score serves the purpose as well as correcting the scores for 
chance. In a true-false test of 50 questions, for example, if the cor- 
rect scores run from 25 to 50, the person who makes 25 is at the bottom 
of the list, whether his score is recorded as 25 or 0; and all of the others 
remain in the same relative positions, but not separated by the same 
number of points, regardless of whether the formula is used or not. 
But if scores are obtained from various tests, unlike in the probable 
effect of chance on the scores, the importance of adjusting the scores to 
correct them for chance becomes evident. If scores are not corrected 




for chance, a score of 50 may be made by one person on tests where 
chance is practically eliminated and by another on tests where 1/2 of
the score may be due to chance. The same scores of 50 for each would 
indicate very different amounts of intelligence. 

The distorted relative values obtained from the usual method of
scoring are further illustrated by the scores made by three individuals
on the Army Alpha Examination.


TABLE II. ACTUAL RECORDS OF THREE EXAMINEES ON THE ARMY ALPHA EXAMINATION,
EACH SCORED IN TWO WAYS, BY THE FORMULA 1 AND ACCORDING
TO THE INSTRUCTIONS IN THE MANUAL

                                  Individuals
                     A                    B                    C
Number          Regular  Score by    Regular  Score by    Regular  Score by
of test         score    formula     score    formula     score    formula
1                  9        9           3        3           9        9
2                 12       12           8        8           2        2
3                 15       15           5        5           4        3
4                 38       38          11       11           6        6
5                 20       20           5        5           7        7
6                 13       13           4        4           7        7
7                 37       36           2        0           7        4
8                 23       17          17       13           3        0
Totals           167      160          55       49          45       38
Percentage of loss
due to formula
marking                4                   10                   16














These meager data indicate that the lower the score the greater the 
percentage due to chance. By correcting scores for probable chance 
marking, the bright are separated from the dull more distinctly, and 
the effectiveness of the test is improved. 

Scoring according to Formula 1 is the universal practice for tests 
in which there is a choice of 1 out of 2 (the true-false test), but for 
tests in which the choice is other than 1 out of 2, the general practice 







is to count the number right as the score. If chance is recognized in 
calculating the score in tests where the choice is 1 out of 2, why should 
it not be considered in determining the score when the choice is 1 out 
of 3, 1 out of 4, 2 out of 5, and in other cases where the maximum 
amount of chance is known? It is evident that the advantage from 
guessing is greater where there are only 2 items from which 1 is to be 
chosen, than where there are three or more from which 1 is to be 
chosen. The greater the number from which a choice is made the 
less probability there is of guessing the right answer. If the number 
is about 50 or more, as in paired vocabularies, the total right is about 
the same as the score computed on the basis of the above principle. 
If, for instance, the test contains 50 questions, and only 25 of the 
answers are correct, the score is 24.5; but if only 25 of 50 questions are 
correct in a true-false test, the score is 0. 
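
The two figures just quoted can be written out with Formula 1, taking U = 0 and W = 25 in both cases and assuming n = 50 for the paired test (50 items from which each choice is made) and n = 2 for the true-false test. The value 24.5 in the text is then a rounding of about 24.49.

```latex
S = (50 - 0) - \frac{50 \times 25}{50 - 1} = 50 - \frac{1250}{49} \approx 24.49,
\qquad
S = (50 - 0) - \frac{2 \times 25}{2 - 1} = 0 .
```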

Conditions sometimes arise in tests with a fixed number of choices 
and a fixed number of items from which to choose that are not pro- 
vided for by Formula 1. If the number of choices that the examinee 
actually makes is greater than the number allowed, how should the 
score be calculated? A deduction in the score proportionate to the 
advantage gained by extra choices should be made. If the total 
number of items in a test equals 15, the number allowed to be chosen 
is 5, and W equals 0, it is evident that if 15 choices are made the score 
would be 0. Also, if the choices actually made are only 5, and W is 
0, the score would be perfect, or 5. The range between 15 and 5 
points is 10. If the examinee marks extra points to the extent of all 
the range, he is penalized 5 points, or 5/10 of a point for 1 unit of the 
range. If the number actually marked is 7 the amount deducted 
would be 5/10 (7 — 5) or 1. Expressed in the form of an equation the 
process of deduction from the score is: S = 5 - 5/10 (7 - 5). If
n_1 is the number of markings allowed in a series, if N is the sum of the
number of markings made in excess of n_1 for each series of the test,
and if n_2 is the total number of items in a series, then the equation in
general terms is:

$$S = S_1 - \frac{n_1 N}{n_2 - n_1} \tag{2}$$





This formula is to be used for cases in which W and U both are 0. If
W and U do not equal 0, Formula 1 and Formula 2 must be combined,
which gives:

$$S = (S_1 - U) - \left(\frac{nW}{n - 1} + \frac{n_1 N}{n_2 - n_1}\right) \tag{3}$$
















Elements may be omitted from Formula 3 to meet various needs.
If U equals 0, U is omitted; if W equals 0, nW/(n - 1) is omitted; and if N
equals 0, n_1 N/(n_2 - n_1) is omitted.
For a simple application of Formula 3, suppose that 30 questions
are given in a true-false test, that 40 of the 60 items from which
choices may be made are marked instead of the 30 allowed, and that 5
are marked incorrectly. S_1 will then equal 30, U = 0, n = 2, W = 5,
n_1 = 1, N = 10, and n_2 = 2. Substituting in Formula 3 gives:

$$S = (30 - 0) - \left(\frac{2 \times 5}{2 - 1} + \frac{1 \times 10}{2 - 1}\right), \quad\text{or } S = 10.$$


This score is the same as the one obtained by the common practice of 
scoring the true-false test by counting all questions that are marked 
both ways as unmarked, and subtracting the wrong answers from the 
right. If all tests in which the maximum amount of chance is deter- 
mined were the simple true-false kind, the above formulas would not 
be necessary. When, however, the series contain many items, some 
of which are omitted by the examinee, some overmarked, some 
marked correctly and some incorrectly; when several choices are per- 
mitted in 1 question, or when 1 choice is permitted from several cor- 
rect items; and when several questions are included in a series; the 
problem becomes too complex to be solved by simple inspection. The 


following scheme will indicate the method of finding the values of 
U, W, and N. 


Explanations: 1. The whole plan represents a test of 11 series, with 7 items in 
each series. From each series of 7 the examinee is to choose 2. 

2. IR means an item that if chosen, or marked, will count 1 point on the score, 
provided the whole number chosen in the series does not exceed 2, in which case the 
additional IR marked counts 1 overmarking (1 unit on the value of N). 

3. I is used to represent items other than IR. If an I is chosen, or marked, it 
will count: (a) wrong, if the number of marks for the series does not exceed the 
number allowed by the directions for marking the test, or 2; (b) overmarked, if 
the number of marks for the series exceeds the number allowed by the directions 
for marking the test. 

4. The directions to the examinee for marking the test are: ‘‘Encircle 2 correct 
items (words, figures, symbols, etc.) out of the 7 in each of the series.” 

5. Of course, in practice the letters IR and I would be replaced by meaningful 
items preceded by a statement, as: Mark 2 of the following that are usually liquids: 
map, fish, ink, country, milk, water, putty. 


NUMBER                                                          VALUES OF
OF SERIES    ITEMS                                              U    W    N
 1           I     (I)   IR    I     (IR)  IR    I              0    1    0
 2           I     I     (I)   I     IR    IR    IR             1    1    0
 3           IR    (I)   I     I     I     (IR)  (IR)           0    0    1
 4           (IR)  I     I     IR    IR    I     I              0    0    0
 5           I     IR    (I)   (IR)  (I)   IR    I              0    1    1
 6           I     IR    I     I     I     IR    (IR)           0    1    0
 7           (IR)  I     (IR)  (IR)  I     I     I              0    0    1
 8           I     I     I     I     (IR)  IR    (IR)           0    0    0
 9           I     (I)   (IR)  (I)   IR    IR    I              0    1    1
10           I     (IR)  IR    I     IR    I     I              1    0    0
11           (IR)  (IR)  (IR)  I     I     I     I              0    0    1
Totals                                                          2    5    5
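
The tallying of U, W, and N for one series can be sketched as follows. The rule coded here is inferred from the explanations and from the rows of the scheme, not quoted from the article: marks beyond the 2 allowed count toward N, unused allowed marks count toward U, and W is whatever credit remains lost. The function name `tally_series` is illustrative.

```python
def tally_series(marked, allowed=2):
    """Return (U, W, N) for one series.

    `marked` lists the items the examinee marked, each "IR" (counts on
    the score) or "I" (does not). The rule is an inference from the scheme.
    """
    n_over = max(0, len(marked) - allowed)     # N: marks beyond the allowance
    u = max(0, allowed - len(marked))          # U: allowed marks left unused
    right = min(marked.count("IR"), allowed)   # marks that earn credit
    w = allowed - right - u                    # W: credit lost to wrong marks
    return u, w, n_over

# Marked items read off a few rows of the scheme.
series_1 = tally_series(["I", "IR"])
series_2 = tally_series(["I"])
series_3 = tally_series(["I", "IR", "IR"])
series_10 = tally_series(["IR"])
series_11 = tally_series(["IR", "IR", "IR"])

print(series_1, series_2, series_3, series_10, series_11)
```

Summing the three values over all series gives the totals that enter Formula 3.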


In addition to the values of U, W, and N expressed as totals in the
scheme, values for the other letters of Formula 3 when applied to the
scheme are: S_1 = 22, n = 7/3, n_1 = 2, and n_2 = 7. Substituting
the values in Formula 3 makes:

$$S = (22 - 2) - \left(\frac{\frac{7}{3} \times 5}{\frac{7}{3} - 1} + \frac{2 \times 5}{7 - 2}\right), \quad\text{or } S = 9\tfrac{1}{4}.$$
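
The arithmetic of Formula 3 can be checked with a minimal sketch, assuming it reads S = (S_1 - U) - (nW/(n - 1) + n_1 N/(n_2 - n_1)). The function and argument names are illustrative; `n_over` stands for N. For the true-false case n_1 is taken as 1, which is what the substitution 1 × 10 implies.

```python
from fractions import Fraction

def formula_3(s1, u, w, n, n1, n2, n_over):
    """S = (S_1 - U) - (n*W/(n - 1) + n_1*N/(n_2 - n_1))."""
    n = Fraction(n)
    wrong_penalty = n * w / (n - 1)
    overmark_penalty = Fraction(n1 * n_over, n2 - n1)
    return (s1 - u) - (wrong_penalty + overmark_penalty)

# True-false test: 30 questions, 5 wrong, 10 questions marked both ways.
true_false = formula_3(s1=30, u=0, w=5, n=2, n1=1, n2=2, n_over=10)

# Scheme of 11 series: 2 to be chosen out of 7, with U = 2, W = 5, N = 5.
scheme = formula_3(s1=22, u=2, w=5, n=Fraction(7, 3), n1=2, n2=7, n_over=5)

# Overmarking only (the Formula 2 case): 15 items, 5 allowed, 7 marked.
overmarked = formula_3(s1=5, u=0, w=0, n=3, n1=5, n2=15, n_over=2)

print(true_false, scheme, overmarked)
```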
Confusion in the understanding and use of the above formulas will
be prevented by keeping in mind the following definitions of the terms
and symbols used in this article.


(A) TERMS 


1. Item means the word, figure, or other symbol that forms the 
smallest part of an examination. For example, Test 8 of the Army 
Alpha Examination contains 160 items, 4 in each of the 40 questions. 

2. Question means an item that counts on the score if correctly 
marked; or the question is a group of items only one of which counts 
on the score. In the scheme of 11 series given just above there are 22 
questions, 2 for each series. 

3. A series is a group of items that are marked as a unit. The 
different series in various tests do not have a fixed number of items, 
but all the series in a given test must be uniform in the total number 
of items each contains, in the number of choices permitted, in the 
number of items that count on the score, and in the number of items 
from which correct choices can be made. Tests are usually so con- 
structed that a series contains only 1 item that counts on the score, 




only 1 that may be chosen, only 1 that can serve as a correct answer, 
and 2, 3, 4, or 5, items from which a choice is made. In the case not 
more than 1 item in the series can count on the score, the series means 
the same as a question. 

4. Test means a number of uniform series, provided it is a test in 
which the maximum amount of chance can be determined. Test is 
also used to designate various other kinds of groups of questions. 
Test 1 of the Army Alpha Examination is an example of one of the 
other kinds. 

5. Examination means a group of tests. It is the largest unit 
considered in this article. In printed form it is the whole booklet or 
folder. Scale is sometimes used in this sense, as “Otis Group Intelligence
Scale.” Test is also used in this sense, as “Standard Achievement
Test on an Introduction to Education.” The last named
contains 4 major divisions, each of them called also “Tests.”


(B) SYMBOLS


Note. The letters used are of two kinds: (1) Capitals, which
designate values for the test as a whole, and (except S_1) whose values
generally cannot be found by multiplying the number of series by
some other factor. (2) Small letters, which designate values for each
series of the test, and are constant for each test.

1. S means the actual score that an examinee makes on any test.

2. S_1 means the total number of items of a test that can count on
the score. S_1 is the maximum score for a test, and is equal to the
number of questions in the test.

3. U means the number of items in a test that count on the score if 
marked, minus the number that is marked. To find the value of U 
each series of items must be checked separately, and the number 
unmarked in each series added to make U. 

4. W means the number of items that count on the score that are 
not marked, and instead of which some other item is marked. The 
value of W must be found by checking each series separately as 
described for U. 

5. N means the sum of the number of markings made in excess of
n_1 for each series of the test.

6. n means the reciprocal of the fraction that expresses the part 
of the score (either for a series or the test) that chance marking will 
approximate. 










7. n_1 means the number of items in a series permitted to be marked
by the directions for marking the test.
8. n_2 means the total number of items in a series.


CONCLUSIONS 


1. The current practice in scoring tests in which the maximum 
amount of chance is known, by counting the number of right answers 
as the score (except in the true-false test), contains a fault that can 
readily be corrected. 

2. The correction can be made in most instances (in which a choice 
of 1 out of 3, 4, 5, and the like items is allowed) by Formula 1, which is: 


$$S = (S_1 - U) - \frac{nW}{n - 1}$$





3. Formula 3 is a general one, which will serve where other cor- 
rections are desired. This formula is: 





$$S = (S_1 - U) - \left(\frac{nW}{n - 1} + \frac{n_1 N}{n_2 - n_1}\right)$$








A SOCIAL ATTITUDES QUESTIONNAIRE 
PERCIVAL M. SYMONDS 
Teachers College, Columbia University 
Education for the future is education in openmindedness.
In order to test this objective of education, a questionnaire was
prepared which really is nothing but a large ballot. Over 100 questions
concerning present day living were assembled. Samples are:


1. Is it desirable that schools be permitted which are conducted
in a foreign language? Yes No
9. Should automobile drivers be given license without examination? Yes No
34. Should society deny any man the right to work? Yes No
59. Should the feeble-minded be educated? Yes No
85. Should the city maintain playgrounds? Yes No
109. Should an accurate record of births, marriages, and death be
kept by a public agency? Yes No


In order to obtain a key each question was answered by five persons: 
a sociologist, an English professor, two psychologists, and the writer, 
with what each considered the liberal, progressive, or radical position. 
Liberal, progressive, and radical were not defined except as being the 
opposite of conservative. It is not thought that these three terms are 
synonymous, but that they contain in common a point of view which 
would set a definite answer to each question. Questions which were 
answered with three yes’s and two no’s or two yes’s and three no’s were 
thrown out, leaving 115 for which the issue was considered definite. 
These were worded so that there was an equal number of yes and no 
liberal answers and were then placed in random order. It is not 
contended that there is a right or wrong answer to these questions, but 
for a key the answers which were given as liberal, radical, or progres- 
sive were used. Many other objections can be raised to such a ques- 
tionnaire. One of the most frequent is that many questions can not 
be answered by either yes or no until they are qualified. In practice 
this objection disappears. If the questions were intended to be 
answered analytically after a carefully thought-out perusal of the 
issues, undoubtedly qualifications are necessary. But if one gives 
offhand his first reaction to each question, he sees that he naturally 
tends to say yes or no. It is this impressionistic answer that is desired
rather than any reasoned out answer. At least in trial it was found 
that a class could run through all 115 questions in a half hour, a speed 
of work which precludes any very lengthy analytical thinking. 


It may further be objected that since this paper holds the liberal 
attitude to be the desirable one, it holds that the liberal side of each 
question is the desirable one. No such brief is held. While I am 
maintaining that liberalism should be the objective and product of 
education, I do not hold the validity of the liberal side of many 
questions as practicable at the present time. They are included in the 
questionnaire simply as a means of sifting. 

The tests were given from Grade VIII in Honolulu public schools 
through the University of Hawaii. Since in the directions permission 
was granted to any pupil to omit any question which contained 
technical terms with which he was not familiar, some items were not 
tried. However, this rarely ran to more than two or three items a 
paper, as pupils seemed to have an opinion on almost every topic. 
Because of these omitted items and because what was desired was the 
ratio of liberalness rather than an absolute score (it should be remem- 
bered that this is not a test but a questionnaire), the score is given in 
terms of per cent—per cent of the questions which were answered on 
the liberal or progressive side. The distribution for college freshmen 
is as follows: 


DISTRIBUTION OF COLLEGE FRESHMEN ON THE SOCIAL ATTITUDES QUESTIONNAIRE

SCORE IN
PER CENT      FREQUENCY
98-100            -
96-97             -
94-95             2
92-93             -
90-91             2
88-89             2
86-87             6
84-85            11
82-83            15
80-81            19
78-79            10
76-77            11
74-75             7
72-73             4
70-71             5
68-69             2
66-67             -
64-65             -
62-63             1

Total            97










MEANS ON THE SOCIAL ATTITUDES QUESTIONNAIRE FOR DIFFERENT CLASSES

                                        MEAN,
                               N        PER CENT
[label illegible]               8        80.3
[label illegible]              25        81.5
[label illegible]              53        82.1
[label illegible]              95        79.3

High school seniors            58        80.5
[label illegible]              57        77.6
[label illegible]              50        78.4
[label illegible]              48        77.9
[label illegible]              37        79.7


The surprising thing gathered from an inspection of the means is 
that there is practically no change in the means from the Grade VIII 
through the university. Let it be said here in advance of the evidence 
that this questionnaire does measure liberalness versus conservative- 
ness and with a fair degree of reliability. Before the above phenome- 
non is discussed, I will give the results of an Information Test covering 
about the same field as the questionnaire. Samples of the questions 
in the information test are: 


2. Absent voting is staying away from the polls, voting after election day, 
voting by mail, voting by proxy. 
9. The Knights of Columbus is an organization of Methodists, Presbyterians, 
Baptists, Catholics. 
32. A coroner conducts a trial, service, inquest, vigil. 
76. Divorce is obtained in city, county, state, federal courts. 
96. The tariff is an issue usually sponsored by the democratic, socialist, farmer- 
labor, republican party. 


One hundred questions were asked. The means for the same groups 
follow. 


MEANS ON THE SOCIAL INFORMATION TEST FOR DIFFERENT CLASSES

                               N        MEAN
College freshmen               87        73.5
[label illegible]              33        66.5
[label illegible]              20        54.5
[label illegible]              30        54.5
[label illegible]              23        59.1


In the information test the customary rise in means is found with 
progress in school; in the questionnaire no such progress is found. No
difference is found in average score on the questionnaire with increase 
in age. In one sense it is surprising that difference in means is not 
found, because the questionnaire has to be read, and apart from any 




differences in liberalness of views, mere differences in reading ability 
led us to expect a change in the means. 

But that there is no difference in the means is doubly surprising 
when one considers that one’s attitudes are almost entirely the product 
of environment or of education in the large sense. It is possible to 
conceive of a person’s being born with a bias toward conservativeness 
or progressiveness. It is not more difficult (apart from social pres- 
sure) to be a conservative rather than a liberal—difficulty does not 
enter. But a child may be born with a predisposition to answer yes 
or no to such a question as “Is it desirable that the community require 
milk inspection?” and there may be something in the hereditary 
equipment of a child to make one answer more easy than another. 
Heredity may give resistance or facilitation to make an answer on 
one side or the other. But the answer to such questions is largely a 
matter of pure education. Education in its broad sense is responsible 
for these attitudes held on social issues. Hence it is surprising that 
strictly school education has been able to make no change in the above 
attitudes beyond Grade VIII. If there is anything that schooling 
should do, it should make children more liberal. Of all the objectives 
of education, training in broadmindedness would seem to be one easy 
of accomplishment. Schooling does give skill, increases one’s informa- 
tion and makes one able to do more difficult tasks, working as it does 
on human material of differing original capacity. 

Children evidently come to school with attitudes formed on many 
social issues. They hear them discussed at home, in church, in the
newspapers. There is a general atmosphere in which we all live and in
which we have our opinions formed. Why do the higher schools make 
so little change in their attitudes? Is it because they do not try to? 
Or is it that they do not know how to? Naturally before education 
can liberalize, teachers must be liberal. Are they? 

There is a definite sex difference on the questionnaire. 


                    AVERAGE,
             N      PER CENT     σ       σ_av.    σ_diff.
Boys        112     79.7         5.14    .49
Girls        93     78.4         4.54    .47      .68


Why boys should be more liberal than girls is not clear. 
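
The last columns of the table above appear to be standard errors; a short sketch shows how such figures would arise. The reading of those columns as the standard deviation, the standard error of each average, and the standard error of the difference is an assumption, adopted because it reproduces the printed values. The names are illustrative.

```python
import math

def standard_error(sigma, n):
    """Standard error of a mean: sigma / sqrt(N)."""
    return sigma / math.sqrt(n)

# The two groups in the table: N, average per cent, standard deviation.
se_first = standard_error(5.14, 112)
se_second = standard_error(4.54, 93)

# Standard error of the difference between two independent means.
se_diff = math.sqrt(se_first**2 + se_second**2)

# Difference between the averages in units of its standard error.
ratio = (79.7 - 78.4) / se_diff

print(round(se_first, 2), round(se_second, 2), round(se_diff, 2))
```

On this reading the difference of 1.3 points is a little under twice its standard error.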

To test the reliability of the questionnaire a second form was pre- 
pared containing the same questions so expressed that the answer 
would be the opposite of the questions in the original questionnaire. 
For instance No. 57 in the original, ‘‘Should marriage be allowed for 




the feebleminded and those with sexual disease?” was made to read
“Should marriage be prevented for the feebleminded and those with
sexual disease?” This second form was given to the college freshman
group about a month after the original. No public discussion of the 
questionnaire was held in the meantime. The correlation of the two 
forms was +.67 for 102 cases. This reliability coefficient could prob- 
ably be raised more effectively by combing out questions which did 
not differentiate than by lengthening the questionnaire. For instance, 
32 selected items (to be given later) show a reliability coefficient of 
+.62 which could be raised to +.83 by making the questionnaire 
contain 120 items of the same merit. Getting a more discriminating 
rating on each question or on some questions by allowing pupils to 
indicate those that they feel strongly about or those on which they are 
fairly impartial should give still higher reliabilities. 

The validity of the questionnaire will have to rest on the method by 
which the liberal side of the elements was selected. Since each ques- 
tion was scored according to its liberal interpretation as judged by 
five competent persons, the total score ought to represent a person’s 
liberal attitude. Of course not every one necessarily answered accord- 
ing to his real attitude. It was possible to camouflage. How much 
of this was done it is not possible to state, but since the questions were 
answered with the seriousness of a school exercise, it is presumed that 
this camouflaging amounted to very little. 

An interesting fact to be gathered from the joint giving of the 
attitudes questionnaire and the information test is the low correlation 
between the two. As obtained from the data it is .28. Correcting 
for attenuation, using .67 as the reliability coefficient of the question- 
naire and .90 as the reliability coefficient of the information test (the 
latter found by correlating random halves and correcting by the 
Spearman-Brown Formula), the relation between liberalism and 
information becomes .36. This is a low correlation, lower than one 
would expect. One would expect the best-informed man to be the 
most liberal and vice versa, yet there is only a moderate relation. 
Careful observation of specific cases yields many contrary instances. 
Our national senators are an unusually well-informed group of men, 
and yet the majority are conservative; on the other hand your extreme 
radical may be an unlettered man who is dissatisfied with his condition 
and has accepted uncritically radical utterances of others promising 
relief. But it is true that there is a small positive relationship between 
liberalism and information among men as a whole. The moral of 










this is that you can not necessarily create attitudes in school by mere 
information-giving sort of instruction. A correlation of .28 is found 
between the questionnaire and intelligence as measured by the Thorn- 
dike Test for High School Graduates. The intelligence test and 
information test correlate as high as .56. What the main factors are 
lending to the formation of attitudes we do not know; we surmise, 
however, that more important than the intellectual factors are the 
emotional. 
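
The correction for attenuation mentioned above can be reproduced with a minimal sketch. It uses the standard Spearman form, the observed correlation divided by the square root of the product of the two reliability coefficients; the function name is illustrative.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Observed correlation divided by sqrt of the two reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Questionnaire vs. information test: r = .28, reliabilities .67 and .90.
corrected = correct_for_attenuation(0.28, 0.67, 0.90)

print(round(corrected, 2))
```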

In the college freshmen group the papers of the 23 with the highest 
total scores and the 23 with the lowest score were scrutinized in order 
to determine the significance of each item. Thirty-two items were 
selected for which there was the greatest discrepancy of liberal answers 
between the two groups. 

These questions, put in an “order of difficulty,” are 


1. Should the consumption of natural resources such as lumber,
coal, oil, be under governmental control? Yes No
2. Should the city hold community pageants or celebrations? Yes No
3. Should mosquitoes be killed at public expense? Yes No
4. Should a city provide parks liberally? Yes No
5. Should any man be required by his employer to work regularly
over eight hours a day in industry? Yes No
6. Should banks be allowed unlimited discounting privileges? Yes No
7. Should stock companies be allowed to capitalize above the
real value of the company? Yes No
8. Should the city adopt a zoning system permitting only certain
kinds of buildings in certain zones? Yes No
9. Should prisoners who labor be denied all recompense? Yes No
10. Should clothing for public sale be allowed to be made in the
homes of the workers? Yes No
11. Should the refuse of private homes be carried away at public
expense? Yes No
12. Should America follow a policy of isolation? Yes No
13. Should there be a minimum wage law? Yes No
14. Should surplus earnings over a certain amount be distributed
to the employees? Yes No
15. Should one purchase in stores clothing that has been made in
the homes of the workers? Yes No
16. Should the government regulate prices in any way? Yes No
17. Should inheritance be taxed? Yes No
18. Is the excess profits tax a desirable form of taxation? Yes No
19. Should all prisoners do hard labor? Yes No
20. Should religion be taught in the public schools? Yes No
21. Should all insurance against sickness be a private matter? Yes No







22. Should there be governmental control of telephone, telegraph,
and cable companies? Yes No
23. Should all insurance against disability be an individual
matter? Yes No
24. Should insurance against accident be a wholly individual
matter? Yes No
25. Should Japanese be permitted naturalization in America? Yes No 
26. Should the government regulate in any way the pay which a
worker receives? Yes No
27. Should prisons keep the convict stripes? Yes No
28. Should criminals receive a sentence of definite length? Yes No
29. Should all persons have for support in their old age only their
own savings or the aid of their friends or private charity? Yes No
30. Should the chief of police be elected by popular vote? Yes No
31. Should the city own and operate moving picture houses? Yes No
32. Should the metric system supplant our present system of
weights and measures? Yes No


Cursory examination of these 32 items and of the 83 that were not 
taken shows several things. (1) Practices that are now in operation 
do not lead to questions which discriminate, and the answers are usually
favorable to the practice. The question “Should streets be kept
clean at public expense?” was answered yes by as many high score
as low score pupils. Yet I venture that many years ago when streets 
were not kept clean at public expense even this question would have 
yielded answers tending to discriminate between the liberals and the 
conservatives. (2) Questions of government control and government 
benevolence are highly discriminative. Other topics which seem to 
discriminate are those having to do with religion, race, the treatment 
of criminals, taxation, international relationships and economic control 
by the government. (3) Questions about which there are marked
differences of opinion but which at the same time do not discriminate
between liberals and conservatives are those relating to private
schools, marriage, the tariff, local control of schools, industrial rela- 
tions, immigration, alcohol, and censorship. 













AN ANALYSIS OF CERTAIN DIFFICULTIES IN 
FACTORING IN ALGEBRA 


E. L. DICKINSON AND G. M. RUCH 


University of Iowa 


The Problem.—The two most fundamental weaknesses of the treat- 
ment of factoring in algebra today probably are: (1) The inclusion of 
a number of “‘cases’’ which have little or no real utility for the average 
pupil, and (2) failure to introduce the symbols actually common to 
the ordinary applications of algebra, e.g., in physics, mechanics, and 
even college mathematics. This opinion, at any rate, has grown from 
a more or less careful analysis of about 10 high school texts in algebra, 
all of which have earned a fair degree of popularity. 

With respect to the first of these criticisms, the typical textbook 
treatment seems to be that of the formal presentation of from five to 
eight special cases of factoring together with the type formula for each. 
The best modern opinion¹ would limit the number of cases to be
taught to three:

1. Common factors of the terms of a polynomial. 

2. The differences of two squares. 

3. Trinomials of the second degree that can easily be factored by 
trial. 

The second weakness may be summed up by stating that algebra, 
as represented by current textbooks, is chiefly an “x-y-z-a-b-c” subject.
Sampling counts of the symbols used in 10 textbooks showed that, in 
the majority of these books, these six letters received from 50 to more 
than 90 per cent of the practice, regardless of the fact that applied 
algebra places minor emphasis on these particular letters. If algebra 
is to be made of the greatest social utility, and is not to be implicitly 
tied to a doctrine of 100 per cent transfer of training, a number of con- 
crete changes are necessary. Specifically, some of these are: 

1. The introduction of subscripts. 

2. The use of primes and other superscripts (including fractional 
exponents). 

3. Vastly increased use of the letters other than a, b, c, x, y, and z.





1 The Reorganization of Mathematics in Secondary Education, Report by the 
National Committee on Mathematical Requirements of the Mathematical Associa- 
tion of America, 1923, p. 24. 




4. More emphasis on the use of upper case (i.e., capital) letters;
particularly the use of upper and lower case letters in the same example 
or formula. 

5. The question of the introduction of certain letters of the Greek 
alphabet should receive consideration. 

6. Experimental study of the utility of decimals as coefficients is 
demanded by virtue of the fact that these seem to be fairly common in 
applied mathematics. 

More than 90 per cent of the algebra of physics is almost a terra 
incognita to the eye of the average pupil with even a fair or better 
knowledge of the conventional algebra textbook, as a few formulas 
selected at random from a textbook in physics will show: 





v₁ = v₀ + at               (falling bodies)
F = [illegible]            (gravitation)
pv = p′v′                  (Boyle’s Law)
[illegible]                (differences in potential)
F = [illegible]            (Coulomb’s Law)


Data will be presented in the tables and discussion to follow which 
have been abstracted from a more comprehensive study of the proc- 
esses and abilities involved in algebraic factoring. The experimental 
results bear directly on three of the above proposals for changes in 
the teaching of factoring, viz.,

1. The introduction of subscripts. 

2. Practice in the use of both upper and lower case letters in the 
same example or formula. 

3. The use of decimal coefficients. 

The Investigation.—In order to make clear the experimental pro- 
cedure involved in the results to be presented, it will be necessary to 
describe very briefly the scope of the larger investigation referred to 
above. In order to study, among other things, the effects of the non- 
conventional letters, subscripts, upper case letters, decimal coeffi- 
cients, etc., a series of eight tests of 20 examples each were prepared, 
paralleling the eight following cases of factoring: 

Test 1. Polynomials with a common monomial factor. Type 
form, ax + bx + cx.













Test 2. Polynomials to be factored by grouping terms and taking
out a common binomial factor. Type form, ax + ay + bx + by.

Test 3. Trinomials which are perfect squares, the square of the
sum of two quantities. Type form, x² + 2xy + y².

Test 4. Trinomials which are perfect squares, the square of the
difference of two quantities. Type form, x² − 2xy + y².

Test 5. Binomials which are the differences of two squares. Type
form, x² − y².

Test 6. Quadratic trinomials of the type, x² + bx + c.

Test 7. General quadratic trinomials of the type, ax² + bx + c.

Test 8. Binomials which are either the sum or the difference of two
cubes. Type form, x³ ± y³.

There was also a ninth test made up of the fifth, tenth, and twenti- 
eth examples from the eight tests. This was used as a control test on 
the effect of segregated vs. mixed presentation of the examples. (It 
might be stated in passing that the mixed presentation proved to be 
slightly more difficult, the average difference in the percentages of 
failures being approximately 4 per cent). Six hundred pupils were 
used in the experiments, although the schools were instructed to omit 
any test covering a type of factoring not taught in those schools. This 
reduced the numbers considerably in the case of Tests 2, 7, and 8. 
The exact numbers are stated in Table I. 
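As a rough sketch of how the ninth (mixed) test described above could be assembled, the snippet below takes the fifth, tenth, and twentieth example from each of the eight tests, giving 24 examples. The data layout and the function name are illustrative assumptions; the source gives only the rule.

```python
from typing import List, Sequence

# 1-based positions drawn from every test, as stated in the text.
CONTROL_POSITIONS = (5, 10, 20)

def build_control_test(tests: Sequence[Sequence[str]]) -> List[str]:
    """Collect the 5th, 10th and 20th example of each test, test by test."""
    return [test[pos - 1] for test in tests for pos in CONTROL_POSITIONS]
```

The order in which the examples were actually printed on the mixed test is not stated, so the test-by-test order here is only one possibility.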

Of the total of 160 examples, 32 or exactly one-fifth, were failed by 50 
per cent or more of the pupils. These 32 examples are given in Table 
I, and are to be thought of as typical of the types of examples on which 
pupil mastery was not attained. It should be borne in mind, however, 
that many of these examples represent types that should not be taught at 
all, least of all mastered. It is interesting to note the relatively small 
percentage (20 per cent) of examples which proved to be too difficult 
for the average pupil (upon a basis of 50 per cent or more failures). 
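The criterion used here (an example is "too difficult" when 50 per cent or more of the pupils fail it) can be written out as a short sketch. The function names and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List

def too_difficult(percent_failing: Dict[str, float], threshold: float = 50.0) -> List[str]:
    """Return the examples failed by at least `threshold` per cent of pupils."""
    return [name for name, pct in percent_failing.items() if pct >= threshold]

def share_of_list(count: int, total: int) -> float:
    """Percentage of the whole list that a count of examples represents."""
    return 100 * count / total
```

With the figures in the text, `share_of_list(32, 160)` gives the 20 per cent quoted above.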

Study of Table I shows the following facts which bear upon our 
discussion: 

1. Of the three examples involving subscripts (in the list of 160), 
all three classify in the too-difficult list (marked “‘S”’). 

2. Of the seven examples involving decimal coefficients, four 
classify in the too-difficult list (marked ‘‘D’’). 

3. Of the three examples involving upper and lower case letters, 
one classifies in the too-difficult list (marked ‘‘U’’). 

The significance of these facts for our discussion will become more 
apparent when Tables II, III, and IV are examined. These show the







relative difficulties of these same examples in comparison with closely 
similar ones selected from the lists covering the same cases of factoring. 


TABLE I.—THE 32 FACTORING EXAMPLES (OF THE LIST OF 160 EXAMPLES) WHICH
WERE FAILED BY 50 PER CENT OR MORE OF THE PUPILS

Rank   Population   Example                                  Percentage failing

  1       467       (D) .35f² + .34ft + .08t²                      78.4
  2       601       (2x + 5y)² − (a + b)²                          74.4
  3       467       3(x − y)² + 7(x − y)z − 6z²                    74.3
  4       467       [illegible]                                    74.1
  5       599       (f + s)² + 32(f + s) + 240                     71.3
  6       601       (S) ½hb₁ + ½hb₂                                70.4
  7       601       (D) .04a² + .12ab + .09b²                      70.2
  8       601       9(x + y)² + 12z(x + y) + 4z²                   63.2
  9       601       (3 + a)² − 2b(3 + a) + b²                      62.7
 10       256       (S) [illegible]                                62.1
 11       599       (D) d² − .5d + .06                             60.8
 12       256       [illegible] − 216                              60.2
 13       601       (D) .4x − .8y [?]                              59.9
 14       467       12t² + 31st − 15s²                             59.7
 15       599       [illegible]                                    58.4
 16       256       [illegible]                                    58.2
 17       601       (x² − y)² − x² [?]                             57.7
 18       601       (6 − j)² + 4(6 − j) + 4                        57.4
 19       346       6ms − 15nt + 9ns − 10mt                        56.9
 20       256       [illegible]                                    56.6
 21       601       (S) W₁ + W₁² [?]                               56.2
 22       346       x² − 4a²x² − y² + 4a²y²                        55.5
 23       256       a⁶ + b⁶ [?]                                    54.7
 24       256       (U) S³ + 8s³                                   54.7
 25       346       a − 1 + a³ − a²                                54.6
 26       256       1 − a³w⁶ [?]                                   53.9
 27       346       3mn + 6m² − 2am − an                           53.5
 28       467       (a + b)² + 5(a + b) − 24                       52.5
 29       467       15 + 37z² − 8z⁴                                52.0
 30       467       [illegible]                                    52.0
 31       599       [illegible]                                    50.0
 32       256       8v³ + 27p³                                     50.4






















TABLE II.—COMPARISONS OF THE THREE EXAMPLES INVOLVING SUBSCRIPTS WITH
SELECTED EXAMPLES SIMILAR IN THE SENSE THAT THEY CLASSIFY IN THE
SAME CASE OF FACTORING. THE DIFFICULTIES IN TERMS OF PERCENTAGE
OF PUPILS FAILING ARE GIVEN

Subscript examples     Percentage   Selected examples from     Percentage
                       failing      same case of factoring     failing

½hb₁ + ½hb₂              70.4       5a + 5b                       4.7
W₁ + W₁² [?]             56.2       3s − 12z                      6.7
                                    ar² + arh                     9.8
                                    ut + ½gt² [?]                13.3
                                    9y² − 6y³ [?]                14.5
[illegible]              62.1       a³ + b³                      31.6
                                    [illegible]                  37.9
                                    [illegible]                  38.3
                                    [illegible]                  38.3
                                    1 + m³ [?]                   38.7














TABLE III.—COMPARISONS OF THE SEVEN EXAMPLES INVOLVING DECIMAL
COEFFICIENTS WITH SELECTED EXAMPLES SIMILAR IN THE SENSE THAT THEY
CLASSIFY IN THE SAME CASE OF FACTORING. THE DIFFICULTIES IN
TERMS OF PERCENTAGE OF PUPILS FAILING ARE GIVEN

Decimal coefficients       Percentage   Selected examples from       Percentage
                           failing      same case of factoring       failing

.4x − .8y [?]                59.9       mx − my                         5.0
                                        4xm² − 16xy² [?]               18.0
                                        3s − 12z                        6.7
.04a² + .12ab + .09b²        70.2       x² + 2xy + y²                   3.8
                                        [illegible]                    19.6
                                        [illegible]                    16.1
.25 − a + a²                 31.3       [illegible]                    16.6
K² − 1.2K + .36              33.3       36r²s² − 84rstw + 49t²w²       11.0
                                        100 − 20f + f²                 13.5
d² − .5d + .06               60.8       r² + 11r + 24                   7.4
a² − .8a + .12               31.4       12 + 7a + a²                    8.4
                                        m⁶ + 12m³ − 13 [?]             21.9
.35f² + .34ft + .08t²        78.4       8c² + 46c − 12                 27.2
                                        3z² + 22z + 7                  21.8
                                        8p² − 14p − 39                 40.0



















TABLE IV.—COMPARISONS OF THE THREE EXAMPLES INVOLVING UPPER AND
LOWER CASE LETTERS WITH SELECTED EXAMPLES SIMILAR IN THE SENSE
THAT THEY CLASSIFY IN THE SAME CASE OF FACTORING. THE DIFFICULTIES
IN TERMS OF PERCENTAGE OF PUPILS FAILING ARE GIVEN

Upper and lower case       Percentage   Selected examples from       Percentage
letters                    failing      same case of factoring       failing

RS + sR + Sr + sr [?]        32.7       aq + ar + qz + rz [?]          25.4
                                        pq − px − rq + rx [?]          44.0
                                        2ac + 2as + 2bc + 2bs          32.7
81R² − 36r²                  22.6       [illegible]                    11.5
                                        [illegible]                    12.0
S³ + 8s³                     54.7       a³ + b³                        31.6
                                        a³ − 8f³ [?]                   44.5
                                        [illegible]                    38.7














Summary and Conclusions.—Study of Tables I to IV leads to a 
number of important conclusions, as follows: 

1. Only a comparatively small percentage (20 per cent) of the 
factoring called for in the average textbook in algebra is beyond the 
ability of the average Grade IX pupil. Of the types of factoring 
proving too difficult, a great many may safely be omitted from algebra 
courses upon a criterion of social utility (Table I). 

2. It is imperative that much practice be given on examples involv- 
ing subscripts. Even the simplest factoring with subscripts is beyond 
the ability of the average pupil. Social utility demands familiarity
with the use of subscripts. The same is probably true of primes 
although no data are presented here (Table II). 

3. Additional attention to the use of decimal coefficients would 
seem to be very necessary (Table III). 

4. Factors with upper (capital) and lower case letters should be
introduced since these are common in applied mathematics (Table IV).

5. The time usually devoted to the rarer and less important cases 
of factoring should be directed toward mastery of equations and formu- 
las involving subscripts, primes, decimal coefficients, upper and lower 
case letters, and possibly selected letters from the Greek alphabet. 
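For reference, the kinds of example on which the pupils failed factor in the ordinary way; only the notation is unfamiliar. The worked lines below are an editorial illustration, not part of the original tables. The first line assumes the damaged subscript example reads ½hb₁ + ½hb₂; the others use examples that are legible in Tables III and IV.

```latex
\begin{align*}
\tfrac{1}{2}hb_1 + \tfrac{1}{2}hb_2 &= \tfrac{1}{2}h\,(b_1 + b_2)
  && \text{(common factor, subscripts)} \\
.04a^2 + .12ab + .09b^2 &= (.2a + .3b)^2
  && \text{(perfect square, decimals)} \\
d^2 - .5d + .06 &= (d - .2)(d - .3)
  && \text{(quadratic trinomial, decimals)} \\
S^3 + 8s^3 &= (S + 2s)(S^2 - 2Ss + 4s^2)
  && \text{(sum of cubes, upper and lower case)}
\end{align*}
```

Each line can be checked by multiplying out the right-hand side.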











THE LAYCOCK TEST OF BIBLICAL INFORMATION 


SAMUEL RALPH LAYCOCK¹


Lecturer in Philosophy, University of Alberta 


(A) Purpose of the Scale.—Despite the rapid advances in educational
measurement and the extension of the principles of this science
to most problems of education, there have been thus far only a few 
random and scattered investigations of the results of religious educa- 
tion. This has been due in part to the popular belief that religious 
development and the formation of character, or anything pertaining 
thereto cannot in the very nature of things be measured. In part also 
it has been due to the scarcity of trained research workers among our 
religious educators, but without any doubt one of the major reasons 
has been a realization of the complexity and extent of the problems, 
and of the unsatisfactory results thus far obtainable from all attempts 
at character analysis. The writer is aware of these difficulties but 
does not consider that one is justified in considering them insuperable. 
Very difficult tasks have been undertaken in the measurement of 
intelligence and have been carried to a fairly satisfactory conclusion. 

It has seemed advisable therefore to approach the study from a 
slightly different angle than any hitherto attempted, and to make an 
experimental investigation of the nature and extent of the Biblical 
information possessed by children. That some such knowledge is 
imperative is the contention of all religious educators, and this claim, 
though lacking experimental demonstration, is yet seemingly supported 
by observations and impressions. It is of considerable importance 
then that some objective evidence be available as to the extent of the 
religious training a child has received, and the permanence and clarity 
of such religious knowledge as he has obtained. We are unable to 
say as yet just what significance, for moral and religious development, 
is to be attached to a greater or lesser amount of such knowledge but a 
quantitative determination of its extent seems to be a pre-requisite to 
any attempt at the solution of the more ultimate question of educa- 
tional policy. 

A study of the tests of Biblical information already available 
showed their inadequacy, and it was decided to construct and standard- 
ize an objective performance scale, which would afford a truer estimate 





1 Summary of thesis presented in partial fulfilment of the requirements for the
Degree of Bachelor of Education, University of Alberta, Edmonton, Alberta.


of the relative knowledge of biblical facts by pupils of different ages, 
creeds, etc. For various reasons it was decided to limit the study to 
adolescent children, and the test as described below is for children from 
twelve to sixteen years of age. 

(B) Construction of Scale.—Religious educators are in agreement 
that certain passages of the Bible are of primary importance; as for 
example, the Lord’s Prayer, the Twenty-third Psalm, the Beatitudes, 
the Ten Commandments and the thirteenth chapter of I. Corin- 
thians. The names of the books of the Bible, facts about the life of 
Jesus and his disciples, the life of Paul, and important events and 
personages in the Old Testament are also considered as basic. The 
writer selected such facts as the common experience of religious edu- 
cators seemed to warrant including in any adequate test. An attempt 
was made to avoid controversial matters and those on which scholar- 
ship has not a fairly definite opinion to offer. 

Preliminary investigations revealed the need of much revision, but 
the test was finally published by the University of Alberta under the 
title of “Laycock Test of Biblical Information.” A manual of directions
and a scoring key were also prepared. The test took the form
of a recognition test, one now familiar to all of us. 

It cannot be too strongly emphasized that the test is not an intelli- 
gence test nor is it claimed that it is a test of religious development. 
It aims to test one thing only—the Biblical information of adolescent 
boys and girls.

A few sample questions will serve to illustrate the nature of the 
scale. 


Test I 


Draw a line under the one word that makes the sentence true as shown in the 
sample: 
Sample: Jesus was a Greek, Egyptian, Jew, Roman. 


Test II 


In the following questions several passages from the Bible are indicated. One 
of the four lines under each heading is taken from the passage. The other three 
are not. Put a cross opposite the correct one as shown in the sample.

Sample: The Lord’s Prayer contains: 

1. Ask and ye shall receive. 
X 2. Thy kingdom come. 
3. Lord, increase our faith. 
4. Create in me a clean heart, O God! 










Test III 


If what a sentence says is true draw a line under the word “True.” If what a
sentence says is false draw a line under the word “False” as shown in the sample.
Sample: 


1. Paul led the children of Israel out of Egypt. True, False. 
2. Jesus visited the temple when twelve years of age. True, False. 


Test IV 


One of the four statements made under each heading is what the passage 
teaches. Select the correct statement and mark it as shown in the sample. 
Sample: One of the Petitions of the Lord’s Prayer teaches to pray that: 


1. We gain great power. 
2. We gain great wealth. 

X 3. The will of God be done on earth. 
4. We gain great fame. 


Test VI 


Put a cross opposite the statement that correctly completes the sentence as 
shown in the sample: 


Sample: The Sacrament of the Lord’s Supper is in commemoration of: 


1. The feeding of the five thousand. 
X 2. The Last Supper. 

3. The Wedding at Cana of Galilee. 

4. The Baptism of Jesus. 


There are seven tests in all, containing a total of 100 questions. The method of 
scoring is to allow one point for each correct answer and to deduct one-half point 
for each incorrect answer. This policy was based on the principle of probability, 
an explanation of which is unnecessary here. 
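The scoring rule stated above (one point for each correct answer, one-half point deducted for each incorrect answer) can be sketched as below. The treatment of omitted questions is an assumption: the source does not mention them, and here they neither add to nor subtract from the score. The function name is hypothetical.

```python
def laycock_score(correct: int, incorrect: int) -> float:
    """Score = correct answers minus half the incorrect answers.

    Omitted questions are assumed to count for nothing either way.
    """
    return correct - 0.5 * incorrect
```

For example, a pupil with 40 correct and 20 incorrect answers out of the 100 questions would receive 30 points under this rule.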


(C) The Investigation.—For purposes of a preliminary standardiza- 
tion the test was given to Grades VII, VIII, and IX, in several cities 
and towns of five of the provinces of Canada. These cases were taken 
as constituting a random sample. The test was given in three prov- 
inces under the personal supervision of the writer and in the other 
two by trained experimenters. In all 1115 results were retained, the 
results of all pupils below the age of 11 years, 6 months and above the 
age of 16 years, 5 months having been discarded. Pupils between
the ages of 11 years 6 months, and 12 years 5 months are classed as 12-
year-olds; pupils between the ages of 12 years 6 months, and 13 years
5 months are classed as 13-year-olds, and so on. The writer attempted
to work out a correlation between the extent of the Biblical knowledge 
of a child, and the period of religious instruction. This project was 
abandoned, owing to the lack of definiteness of the latter unit. 
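The age grouping described in this paragraph amounts to rounding to the nearest year, with results outside 11 years 6 months to 16 years 5 months discarded. A minimal sketch, with a hypothetical function name:

```python
from typing import Optional

def age_class(years: int, months: int) -> Optional[int]:
    """Return the nominal age class (12 to 16), or None if the result is discarded."""
    total_months = years * 12 + months
    lowest = 11 * 12 + 6    # 11 years 6 months
    highest = 16 * 12 + 5   # 16 years 5 months
    if total_months < lowest or total_months > highest:
        return None
    # 11y6m..12y5m -> 12, 12y6m..13y5m -> 13, and so on.
    return (total_months + 6) // 12
```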








It is very probable that further investigations will require us to 
change the norms and perhaps may cause us to modify some of our 
conclusions, but the following may be taken as tentative. 

(D) Results and Conclusions. 


1. Norms for whole of Canada. 


CASES     AM      SD      PE
1115     29.3    16.6    11.97


The arithmetic mean would seem to suggest that the boys and 
girls of our nation have obtained no adequate knowledge of even 
elementary facts about the Bible. The high standard deviation indi- 
cates a great unevenness in such information. 
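The source does not say how the PE column was obtained. One plausible reading, offered here only as an assumption, is the conventional probable error of a normal distribution, 0.6745 times the standard deviation; several SD and PE pairs in the provincial norms reported later in the paper (for example SD 14.65 with PE 9.88, and SD 18.8 with PE 12.68) agree with it.

```python
# Conventional factor linking the probable error to the standard deviation
# of a normal distribution (assumed here; not stated in the source).
PROBABLE_ERROR_FACTOR = 0.6745

def probable_error(sd: float) -> float:
    """Probable error of a single measure, given the standard deviation."""
    return PROBABLE_ERROR_FACTOR * sd
```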

2. Age norms for Canada. (Vide Graph.) 


CASES   SEX     12 YEARS   13 YEARS   14 YEARS   15 YEARS   16 YEARS
551     Boys     30.56      29.7       28.72      26.62      25.5
564     Girls    29.36      31.23      32.38      29.01      25.15


With the boys there is a continual decrease in the arithmetic mean
beyond 12 years of age. With the girls this is not quite the case,
though even here the arithmetic mean for 16-year-olds is much lower than
for 12-year-olds. There are probably several factors responsible for this
result. The predominating one is probably the fact that the results
were taken from pupils of Grades VII, VIII, and IX only. Pupils of 16
years in these grades are retarded pupils. The cause of retardation is a
much debated question, but the outstanding cause is undoubtedly mental
inferiority. Similarly any pupil of 12 years of age in these grades is an
accelerated pupil, due in almost all cases to superior intelligence. There
is, therefore, a more highly selected group of 12-year-olds than of
13-year-olds, and so on. There can be no doubt that superior
intelligence functions in the Sunday School as well as in the public
school. The relation of intelligence to character is another matter,
but numerous investigations indicate that the correlation between the
two is a significant one.

[Graph 1. Arithmetic mean by age, 12 to 16 years.]







The factor of interest probably enters also. Prepubescent and 
pubescent pupils are still attending Sunday School in considerable 
numbers and are interested in its activities. The majority of post- 
pubescent pupils or pupils in the later stages of pubescence are not in 
Sunday School, and other interests are dominant. 

3. Sex norms for Canada. (Vide Graph 2.) 


CASES   SEX       AM       SD
551     Male     28.43    15.65
564     Female   30.204   17.15


The girls, therefore, rank slightly higher in the test than the boys. 
They are probably in the Sunday School more regularly and remain 
there later in life than the boys. 
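As a consistency check on the figures, the two sex means can be pooled, weighting each by its number of cases; the result should agree with the national arithmetic mean of 29.3 reported for the 1115 cases. The helper below is an illustrative sketch with a hypothetical name.

```python
from typing import Iterable, Tuple

def pooled_mean(groups: Iterable[Tuple[int, float]]) -> float:
    """Weighted mean of group means; each group is (number of cases, mean)."""
    groups = list(groups)
    total_cases = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_cases

# Boys: 551 cases, AM 28.43; girls: 564 cases, AM 30.204.
national = pooled_mean([(551, 28.43), (564, 30.204)])
```

Rounded to one decimal place, `national` comes to 29.3.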
































[Graph 2. Sex distribution of scores, boys and girls; horizontal axis, score.]
4. Provincial norms. (Vide Graph 3.) 

PROVINCE           CASES     AM       SD       PE
Alberta [?]         342     26.87    14.65     9.88
Saskatchewan        116     25.172   14.6      9.88
Ontario [?]         105     25.05    11.4      7.69
New Brunswick       223     27.69    18.8     12.68
Quebec              202     43.343   16.3     10.994


These results are among the most instructive we have. The 
arithmetic mean of the four provinces where religion is not taught in 




the public schools falls within a range of two points. The arithmetic 
mean of Quebec is more than 50 per cent higher than any of the other 
provinces. While one would hesitate to draw any very definite 
conclusions from a limited number of cases, yet the results would seem 
to indicate that this marked difference is to be attributed to the fact 
that the Bible is taught in the schools of Quebec. 
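The comparison with Quebec can be verified from the provincial means as printed. The sketch below computes by what percentage the Quebec mean exceeds the highest of the other four; the variable names are illustrative.

```python
# Arithmetic means from the provincial norms.
quebec_mean = 43.343
other_means = [26.87, 25.172, 25.05, 27.69]

def percent_excess(value: float, reference: float) -> float:
    """How far `value` exceeds `reference`, as a percentage of `reference`."""
    return (value / reference - 1) * 100

excess_over_next = percent_excess(quebec_mean, max(other_means))
```

The excess over the next-highest province is about 56.5 per cent, which supports the statement that the Quebec mean is more than 50 per cent higher than that of any other province.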

The investigation as a whole brings out the fact that the children
we have examined, on reaching an age at which many leave Sunday
School, have a very inadequate knowledge of the Bible. If it be
granted that this group is typical of Canadian childhood, and if the
belief of religious educators be well founded (that it is important that
children should know the more significant facts of the Bible if they are
to build Christian characters), then some more effective means of
providing knowledge must be secured. Present methods of religious
teaching, at least so far as Protestant denominations go, seem to be
securing paltry results.

[Graph 3. Provincial and national arithmetic means. A, Alberta;
B, Saskatchewan; C, Ontario; D, Quebec; E, New Brunswick; F, Canada.]

It is suggested that the test may serve the following purposes: 

1. To measure the performance of a class. 

2. To compare the performance of a class at different dates. 

3. To discover individual differences within a class. 

4. To discover special defects in the knowledge of a class. 








A RE-EXAMINATION OF A SOCIALLY COMPOSITE 
GROUP WITH BINET AND WITH 
PERFORMANCE TESTS 


J. F. DASHIELL AND W. D. GLENN 
The University of North Carolina 


In 1916-17 a study was made of children in the schools of Chapel 
Hill, N. C., by applying the Stanford Revision of the Binet tests.¹ Of
the children examined those were selected for comparative study who 
fell between the ages of 9 years 0 months and 12 years 11 months, a 
total of 77 cases. Most of these were in Grades IV, V, and VI. The 
results at once showed a distribution of scores differing from the 
“normal,” and investigation of the social and economic classes from 
which the children came brought to light an interesting correlation. 
The children of this school came from three economic classes: Families 
of members of the faculty of the University of North Carolina; families 
resident in the town of Chapel Hill, the fathers being small tradesmen, 
artisans, teamsters, etc.; and families resident on farms in the country 
outside Chapel Hill. The median IQ of the ‘‘faculty” group of 
children was found to be highest in these tests, that of the “town” 
group distinctly lower, and that of the “country” group lower still. 
The authors, in their discussion of these results, seek explanation in 
terms of heredity as the main determining factor. Environmental 
differences between faculty and town children were plainly much 
less here than between groups used in other surveys made at Columbus 
and at Cambridge; and, in fact, one environmental difference, that 
between town and country life, was considered as being based further 
back upon selective factors working upon native differences. 

The writers of the present paper were interested in amplification 
of the original study by making re-examinations on the same and 
another basis four years later. The subjects of this later study were 
different individual persons from those of the earlier but were taken 
from the same social classes and to a certain extent, indeed, from the 
same families.

Stanford Revision Examination.—The first aim of the present 
writers was to duplicate in detail the procedure of the earlier study, and 
thus to throw further light upon the validity of the results. 





1 H. W. Chase and C. C. Carpenter: Response of a Composite Group to the
Stanford Revision of the Binet-Simon Tests. Journal of Educational Psychology, 
Vol. X, pp. 179-188. 




Owing to disturbed school conditions fewer children were this time 
available, the final list numbering 14 children selected within the age 
limits above (9 years 0 months and 12 years 11 months) from each of 
three groups, “faculty,” “town,” and “country.” In making this 
selection, for each social group, four children were chosen at random 
from those aged between 9 years 0 months and 9 years 11 months, four 
from those 10 years 0 months to 10 years 11 months, four from those 
11 years 0 months to 11 years 11 months, and two from those 12 years 
0 months to 12 years 11 months. Within these limits the choice was 
entirely by chance, and no attention was paid to school grades in 
which the children were to be found. The average chronological age 
of each group as selected turned out to be: Faculty, 10 years 7 months; 
town, 10 years 10 months; and country, 10 years 11 months. The 
corresponding average school grades were: Faculty, 5.0; town, 4.4; 
country, 4.0. The number of children in each group retarded in 
grade was: Faculty, 2; town, 5; country, 10. 
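The selection procedure described above is a small stratified random sample: within each social group, four children are drawn at random from each of the ages 9, 10, and 11, and two from age 12, making 14 per group. The sketch below illustrates this under assumed data structures; the names and the use of Python's `random` module are illustrative and say nothing about how the authors actually drew their sample.

```python
import random
from typing import Dict, List

# Children to draw per age band (age in completed years), as stated in the text.
QUOTAS = {9: 4, 10: 4, 11: 4, 12: 2}

def draw_group_sample(children_by_age: Dict[int, List[str]],
                      rng: random.Random) -> List[str]:
    """Draw the quota for each age band at random, without replacement."""
    sample: List[str] = []
    for age, quota in QUOTAS.items():
        sample.extend(rng.sample(children_by_age[age], quota))
    return sample
```

Applying `draw_group_sample` once to each of the faculty, town, and country rosters would give the 42 children of the study.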

The results are presented in Table I in terms of the IQ and the per- 
centage of each group falling in each division of the IQ distribution 
(second, third, and fourth columns). The distribution of the group 
as a whole found in the present study is shown in the fifth column, 
that found in the earlier study in the sixth, and Terman’s results in 
the last. 

A point of some interest is that the distribution of measures in the
present study approaches more closely that given by Terman,¹ namely his
distribution of the IQ’s of 905 unselected children 5 to 14 years of age;
the town group in particular approximates the latter. This difference in
distribution between the two studies may be due in some part to the
random selection method used here. However, such factors of chance
selection would seem hardly large enough to account for such a definite
difference in groups numbering as many as 65 and 42 individuals,
respectively.

The principal interest is in the test differentiation of social classes. 
In the earlier study the median IQ of the faculty group was found to 
be 115.25; that of the town group, 92.3; of the country group, 81.5. 
The median IQ’s of this study were respectively: 112.0, 99.5, and 94.5, 
with some overlapping of the last two groups, as found by the average 
deviations. Two things are apparent. The results with the later 
groups show less wide distribution in the median scores. They show, 
however, the same order of magnitude of the medians for the three 





1 “Measurement of Intelligence,” p. 66. 







social classes. The distribution of individuals of these different groups
is shown in Table II. It is clear that the children from these three
classes in the Chapel Hill School fall into three distinct intelligence
groups, as measured by the Stanford Revision Tests.

The question that psychologists have conscientiously kept before 
them concerning the results of tests of the Binet type is the degree to 
which the literacy of the examined individual may influence his score. 
To check this factor of literacy, then, it was decided to re-examine the 
same individual children by means of a scale involving a minimum or, 
if necessary, none of the language functions. 

Pintner-Paterson Performance Examination.—Use was made of 
tests selected from the Pintner-Paterson series,¹ those chosen being:
Paterson Five Figure Form Board, Healy Construction Puzzle A,
Knox-Kempf Feature Profile Test, Healy Picture Completion Test,
Woodworth-Wells Substitution Test, and the Knox-Pintner Cube
Test. Limitations of time called for the selection of fewer tests than
given in either the short or the long Pintner-Paterson Scales. This
selection was made on two bases. Variety was sought for, it being
thought inadvisable to use, say, four form board tests if other kinds of 
problems were to be had. In the second place, only those tests were 
considered for which Pintner and Paterson had gotten results continuing 
to show differentiation of individuals at the year nine and older. For 
instance, the Mare and Foal Test was not usable for this experiment 
because the data presented by those authors showed practically no 
improvement in solution of it by children of increasing ages beyond 
nine years, whereas the Healy Puzzle A showed improvement with age 
up to 13. 

The instructions for the administration of the tests selected, as 
given in the books of Pintner and Paterson and of Pintner and
Anderson,² were followed in detail. Scoring was done with a printed
sheet that was an abbreviation of that shown in the former.³ The
Pintner-Paterson table of credits⁴ was used to determine the weighting
of the raw scores obtained, and the total credits or points were used
as the final rating of the individual.





1 Rudolph Pintner and D. G. Paterson: “A Scale of Performance Tests,”
N. Y., 1917.

2 Ibid. and Rudolph Pintner and M. M. Anderson: “The Picture Completion
Test,” Baltimore, 1917.

3 Op. cit., pp. 203, 207.

4 Op. cit., pp. 176-7.
















Unfortunately, not all the subjects originally used for the Stanford- 
Binet Tests were available for this examination, there being 13, 12, 
and 9 of the original 14 in the faculty, town, and country groups 
respectively. On the basis of so small a grouping the results are, of 
course, almost negligible, but are here presented for whatever they 
may be worth. Table III gives the number of individuals within 
each division of the point score distribution. Inspection reveals the 
fact that the country children who took the performance tests did 
somewhat better than the town children who did so, but remained 
distinctly below the level of the faculty children. This is shown more 
clearly by the average scores made by the three groups; for the
country, town, and faculty groups they were, respectively, 110.9,
103.3, and 155.8. Again, taking 124 as the median score of the whole
composite set of 34 children, we find 44 per cent of the country group
reaching or exceeding this median, 33 1/3 per cent of the town group,
and 77 per cent of the faculty group.
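
The percentages quoted above can be checked with a short calculation. The following Python sketch assumes that 4, 4, and 10 children of the country, town, and faculty groups reached the median; these counts are only inferred from the reported percentages and the group sizes of 9, 12, and 13, and are not stated in the text.

```python
from fractions import Fraction

# Group sizes for the performance examination, as stated in the text.
GROUP_SIZES = {"country": 9, "town": 12, "faculty": 13}

# ASSUMPTION: counts at or above the composite median of 124 points.
# They are inferred from the reported percentages, not given by the authors.
AT_OR_ABOVE_MEDIAN = {"country": 4, "town": 4, "faculty": 10}


def percent_at_or_above(count: int, size: int) -> Fraction:
    """Exact percentage of a group reaching or exceeding the median."""
    return 100 * Fraction(count, size)


if __name__ == "__main__":
    for group, size in GROUP_SIZES.items():
        pct = percent_at_or_above(AT_OR_ABOVE_MEDIAN[group], size)
        print(f"{group}: {float(pct):.1f} per cent")
```

With groups this small a single child shifts a percentage by 8 to 11 points, which bears out the caution already expressed about the size of the grouping.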

Explanation of the shift in position of the country and town groups 
with reference to each other we would seek in two different directions: 
The faulty selection of the particular 9 of the 14 country children or 
the particular 12 of the 14 town children, or both; or else, a difference 
in the nature of the two kinds of tests as respects the exact traits 
measured. 

To take care of the first point Table IV is given, showing results
in both examinations for each group of subjects taking both. Since
the relative positions of the groups so far as the Binet Test is
concerned remain practically unaltered by the selection of those
individuals who took also the Performance Test, it follows that the
shift in position on the two tests by the country and town groups is
not due to this factor of selection.

There remains the possibility that the two tests do not measure 
intelligence in exactly the same form. Further, it is possible that the 
capacities of the country and town children are differentiated not by 
fundamental nature but by habit and acquaintance, i.e., previous
experience with the types of things called for in the examinations used 
—so that the former were at least as well and perhaps better prepared 
to attack manipulation problems while the latter were better prepared 
to handle problems involving language and number concepts, social 
situations, etc. As for the faculty group, it is clear that here is a 
distinctly abler type of child.

A survey of the population sampled by the selection of the children 
used in this study offers some support. The town and country
populations are largely derived from a common stock that has been native
in the State and even in the locality for several generations; and much
intermarriage has served to interrelate them still further. What
difference there is is mainly occupational, the townsmen being
tradesmen and artisans on a small scale, the countrymen being farmers
on an equally simple scale. That there may have been an amount of
drafting off of the more intelligent from the economically poor
countryside is true; but it is suggested that the scale of business
operations in the
town is not such as to put a premium on intelligence and so to draw it 
in that direction. On the other hand, the faculty population has been, 
in the nature of the case, assembled largely because of considerations 
of native ability; and, moreover, a high percentage was drawn from 
without the State and from widely separate sections of the nation. 

A generalization would suggest itself: Studies of differences in 
population intelligence to be found along lines of social and economic 
cleavage should include adequate recognition of and provision for the 
contributions of environment and acquired equipment in the “intelli- 
gence” of the human beings measured; more specifically, studies of 
the problem by use of variations of the Binet Scale only, as those
made in Cambridge, Columbus, Columbia, S. C., etc., should be
checked by the employment of intelligence tests involving some
motor performances.











TABLE I

IQ                 Country    Town    Faculty    1921    1917    Terman
56-65                 0.0      0.0      0.0       0.0     1.5     0.33
66-75                 0.0      0.0      0.0       0.0     7.7     2.3
76-85                28.5      7.1      0.0      11.9    20.0     8.6
86-95                28.5     14.3      0.0      14.3    43.1    20.1
96-105               42.8     50.0     21.4      35.7    23.1    33.9
106-115               0.0     21.4     50.0      26.2     4.6    23.1
116-125               0.0      7.1     21.4       9.5     0.0     9.0
126-135               0.0      0.0      7.1       2.4     0.0     2.3
136-145               0.0      0.0      0.0       0.0     0.0     0.55
Median IQ            94.5     99.5    112.0      99.5    92.3
[label illegible]     4.5      5.0      6.5












































































TABLE II

IQ                            Country    Town    Faculty
70-80 “borderline”               3         -        -
80-90 [label illegible]          2         2        -
90-110 “average”                 9         8        4
110-120 “superior”               -         3        6
120-140 “very superior”          -         1        4

TABLE III 
Point scores Country Town Faculty 
30-50 0 1 0 
50-70 2 1 1 
70-90 0 2 0 
90-110 1 3 1 
110-130 4 3 1 
130-150 1 1 0 
150-170 1 0 4 
170-190 0 1 4 
190-210 0 0 2 

TABLE IV

                 Binet (IQ)           Performance (points)
Groups          Av.      A.D.            Av.       A.D.
Faculty        111.7     9.0            155.8      38.2
Town           100.2     [illegible]    103.2      27.1
Country         94.2     8.1            110.9      22.3























RETESTS AND THE CONSTANCY OF THE IQ 


L. S. RUGG


Principal of the West Alexandria Grammar School 


Alexandria, La. 


This paper presents the results of 114 pairs of mental tests (or 228 
individual tests, each pair having been given at different times to the 
same child) given to children in the West Alexandria Grammar School 
by Miss Beulah Lanius and Mrs. B. M. Williams, primary teachers in 
the school. The Stanford Revision of the Binet-Simon Tests was 
used in making the tests. The testing was done under the supervision 
of the writer. The purpose of the retests was to determine the 
relative constancy of the intelligence quotient, or the amount of 
change between the first and second test results, and to determine 
the reliability of the testing which these teachers were doing in 
the school. 

The children tested were not selected. They were somewhat 
above the average in intelligence, which is accounted for by the fact 
that the school (which is a public school) is located in one of the best 
residential sections of the city, and by the further fact that a larger 
number of children of low than of average or of high IQ who were given 
the first tests, moved from the district before the retests were given. 
The range of the IQ for the 114 cases was from 73 to 133 on the first 
tests, the median being 106.5. The time which intervened between
the two tests ranged from 4 to 36 months, the median interval being 21
months.

The data for the 114 pairs of tests were tabulated in detail, showing 
the date of each test, the chronological age, the mental age, and the 
intelligence quotient of each of the 114 children at the time each test 
was given. The facts shown below were derived from a study of 
these data. 


1. Coefficient of correlation (Pearson): .948 ± .006
2. Average difference in IQ between first and second tests (all
   cases): 3.1
   Average increase: 1.9
   Average decrease: 1.2

3. Distribution according to age-groups 








Age interval        Number of cases    Average difference
5-0 to 5-11               12                 3.17
6-0 to 6-11               55                 2.9
7-0 to 7-11               16                 3.5
8-0 to 8-11               16                 3.75
9-0 to 9-11               10                 3.6
10-0 to 10-11              3                 2.0
11-0 to 11-11              1                 9.0
15-0 to 15-11              1                 0.0











4. Distribution according to time elapsing between the tests (three time 
groups) 














Time interval between tests    Number of cases    Average difference
[illegible]                          12                 2.17
[illegible]                          62                 2.92
25 months, or above                  40                 3.67
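
As a check on the table of time groups, the three group averages, weighted by their numbers of cases, should return the over-all average difference of 3.1. The Python sketch below makes that comparison. It takes the middle group as 62 cases, a figure assumed here because it is the one implied by the total of 114 pairs.

```python
# (number of cases, average IQ difference) for the three time-interval groups.
# ASSUMPTION: the middle group is taken as 62 cases, the count implied by
# the stated total of 114 pairs of tests.
TIME_GROUPS = [(12, 2.17), (62, 2.92), (40, 3.67)]


def pooled_average(groups):
    """Average of the group means, weighted by the number of cases."""
    total_cases = sum(n for n, _ in groups)
    weighted_sum = sum(n * mean for n, mean in groups)
    return weighted_sum / total_cases


if __name__ == "__main__":
    print(f"cases: {sum(n for n, _ in TIME_GROUPS)}")
    print(f"pooled average difference: {pooled_average(TIME_GROUPS):.2f}")
```

The pooled figure agrees with the reported 3.1 to the first decimal place, so the group averages and the over-all average are mutually consistent.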





The coefficient of correlation was very high (.948) with a probable 
error of only .006. The average increase in the IQ for the second test 
was slightly more than the average decrease. The average difference 
in IQ between the two tests was rather small, being only 3.1 points. 
The difference between the changes in the IQ for the different age 
groups is too slight to be at all significant. The time elapsing between
the tests seemed to influence the results but slightly, the difference 
becoming slightly greater as the interval of time between the tests 
increased. The greatest change in the IQ for an individual case was 
11 points. These facts show that the IQ’s for the 114 cases retested 
remained relatively constant; or in other terms, the facts show that 
the results of two tests given to the same pupil at different times are 
approximately the same. Furthermore, the results of the retests 
have supplied us with evidence that the testing done in this school is 
reliable to a high degree. Knowing that our work in testing is reliable, 
we are “carrying on” with it.
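
The probable error attached to the coefficient can be reproduced approximately. The Python sketch below assumes the conventional formula of the period, P.E. = 0.6745(1 − r²)/√N; the paper does not state which formula was used, so this is an illustration only.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Probable error of a Pearson coefficient r based on n paired cases.

    ASSUMPTION: uses the conventional formula 0.6745 * (1 - r^2) / sqrt(n);
    the paper itself does not say how its probable error was computed.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


if __name__ == "__main__":
    pe = probable_error_of_r(0.948, 114)
    print(f"r = .948, N = 114, probable error = {pe:.4f}")
```

Under this assumption the value for r = .948 and N = 114 rounds to .006, in agreement with the figure reported.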

Our results are in substantial agreement with the findings of other
investigators, as is evidenced by the following figures compiled by
Dickson and shown in Note 3, page 66, of his book entitled “Mental
Tests and the Classroom Teacher.”


Dickson’s Compilation:

Investigator            Correlation
Cuneo and Terman        .85 (31 cases)
                        .94 (21 cases)
                        .95 (25 cases)
[illegible]             .72 (274 cases)
Rugg and Colloton       .84 (137 cases)
Terman                  .93 (435 cases)
[illegible]             .72 to .93 (for various groups)
[illegible]             .84 (44 cases)
[illegible]             .95
[illegible]             .82
Garrison                .88 (298 cases, 1 year’s interval)
                        .91 (127 cases, 2 years’ interval)
                        .83 (42 cases, 3 years’ interval)


Dickson also calls attention to 435 retests made by Dr. Terman, 
resulting in the high correlation of .93; and to two other studies made 
by Terman, resulting in correlations of .94 and .95, respectively. 
Clearly the obtainable data point to the conclusion that the IQ remains 
relatively constant. 

















NOTES ON ARTICLES IN EDUCATIONAL
PSYCHOLOGY IN CURRENT ISSUES OF
OTHER MAGAZINES


REPORTED BY C. O. MATHEWS











INTELLIGENCE TESTING 


A Group Intelligence Examination without Prepared Blanks. J. Crosby Chap- 
man. Journal of Educational Research, 1925, April, pp. 269-279. Revised form 
of this test for Grades VI to XII with data showing reliability, validity and norms. 

The Value of Photographs and Handwriting in Estimating Intelligence. Kather- 
ine T. Omwake. Public Personnel Studies, 1925, Jan., pp. 2-15. An extensive 
investigation concluding that photographs and specimens of handwriting are of no
value in estimating intelligence. 

What is Intelligence? Frank N. Freeman. The School Review, 1925, April, 
pp. 253-263. A discussion of the nature of intelligence offering a definition. 

A Study of Re-tests. S. C. Garrison and M. S. Robinson. Journal of
Educational Research, 1925, Mar., pp. 190-196. A comparison of results on the
Stanford-Binet and the National Intelligence Test on the basis of scores
secured by re-tests.

Ability Grouping of Junior High School Pupils in Cleveland: Some Practical 
Aspects of the Problem. Frank G. Pickell. Journal of Educational Research, 
1925, April, pp. 244-253. A consideration of problems arising from classification 
by means of intelligence tests. 


ACHIEVEMENT TESTING 


Standardized Tests in Bacteriology. Oscar B. Hunter and F. A. Moss. Public 
Personnel Studies, 1925, Feb., pp. 52-66. Four standardized tests in bacteriology 
with norms. 

The Influence of Standardized Tests on the Curriculum in Arithmetic. Clifford
B. Upton. Teachers College Record, April, 1925, pp. 527-641. A criticism of 
some of the features of existing arithmetic tests. 

Scientific Study of Instruction. Harlan C. Hines. School and Society, 1925, 
Mar. 14, pp. 303-307. A discussion of the measurement movement as it affects 
instruction. 

A Study of Intelligence and of the Training of Teachers as Factors Conditioning 
the Achievement of Pupils, I. J. M. Hughes. The School Review, 1925, Mar., 
pp. 191-200. A study of the intelligence and achievement of high school pupils 
in physics. 

Observations on Factors Determining Success in Physics. Archer W. Hurd. 
School Science and Mathematics, 1925, Mar., pp. 259-266. Intelligence tests, 
an achievement test and the AQ technique are used. 



LEARNING AND PSYCHOLOGY OF SCHOOL SUBJECTS


Conditioning and Unconditioning Emotions in Infants. Mary Cover Jones. 
Childhood Education, 1925, Mar., pp. 317-322. Shows how responses are learned 
and unlearned in the case of emotions. 

Reading Difficulties in Arithmetical Computation. W. E. Lessenger. Journal 
of Educational Research, 1925, April, pp. 287-291. “Errors due to faulty reading
virtually disappeared as a result of training in reading without any specific reference 
to arithmetic.” 

How Shall Subtraction Be Taught? F. B. Knight, G. M. Ruch, and O. S. Lutes. 
Journal of Educational Research, 1925, Mar., pp. 157-168. Some principles of 
learning underlying the teaching of subtraction by different methods. 

The Psychology of School Subjects. Frank N. Freeman. School and Society, 
1925, Mar. 21, pp. 337-342. A paper defining this field and showing the need of 
research in it. 

Time-expressions Comprehended by Children of the Elementary School I. Mary 
G. Kelty. The Elementary School Journal, 1925, Mar., pp. 522-528. An analysis 
of literature to find what time-expressions children of the first three grades are
supposed to know. Test results in a subsequent article.

Problems in Beginning Reading. Arthur I. Gates, assisted by Esther Hemke 
and Dorothy Van Alstyne. Teachers College Record, 1925, Mar., pp. 572-591. 
Differences in practices now prevalent in teaching beginning reading with some 
psychological implications. 

A Study of Children’s Choices of Reading Materials. Emma B. Grant and 
Margaret L. White. Teachers College Record, April, 1925, pp. 671-678. Results 
of library and classroom investigations and a comparison of these results with the 
content of 15 school readers. 

Children’s Conceptions of Radicalisms. Hyman Meltzer. School and Society, 
1925, Mar. 28, pp. 390-392. Individual examinations were given to 302 children 
to determine their grasp of certain concepts. 


CHARACTER AND PERSONALITY 


Objective Methods of Measuring Character. Mark A. May and Hugh Harts- 
horne. The Pedagogical Seminary and Journal of Genetic Psychology, 1925, 
Mar., pp. 45-67. Complete bibliography of all objective measures in this field 
with considerations as to their validity, reliability, scoring and norms.

Success, Personality, and Intelligence. W. W. Charters. Journal of Educa- 
tional Research, 1925, Mar., pp. 169-176. The influence of personality traits 
on success in school and in occupations. 

Ideals, Situations, and Trait Actions, II. W. W. Charters. The Elementary 
School Journal, 1925, Mar., pp. 507-517. An analysis of situations involving 
courteous trait actions. 

Utilizing the Results of the Downey Individual Will-temperament Test in Pupil 
Administration. W. C. Reavis. The School Review, 1925, Mar., pp. 174-183. 
Three case studies in the University of Chicago High School. 

The Will-temperament of Upper-grade and High-school Pupils. Arthur E. 
Traxler. The School Review, 1925, April, pp. 264-273. Results of pupils in 
four small high schools on the Downey Group Test. 




Report on a Questionnaire Study of Personality Traits with a College Graduate 
Group. F. L. Wells. Mental Hygiene, 1925, Jan., pp. 113-127. Summary of 
the questionnaire, the replies, and the interrelationships of certain items. 

The Measurement of Fundamental Character Traits by a New Diagnostic Test. 
Ronald C. Travis. The Journal of Abnormal Psychology and Social Psychology, 
1925, Jan.—Mar., pp. 400-420. An attempt to get objective measures of 50 traits 
by setting up statements involving personal attitudes, mental sets, etc. 


MISCELLANEOUS 


An Analytical Study of 120 Superior Children. Alice M. Jones. The
Psychological Clinic, 1925, Jan.—Feb., pp. 19-76. All but five are above 140 IQ. Data
are given relative to their intelligence, achievement, nationality, physical traits,
social and economic status.

Scientific Methods of Studying Preschool Children. Bird T. Baldwin. School 
and Society, 1925, Mar. 21, pp. 360-362. A brief summary of some experiments 
in progress at the Iowa Child Welfare Research Station. 

An Experimental Study of the Rating Scale Technique. Sarah E. Marsh and 
F. A. C. Perrin. The Journal of Abnormal Psychology and Social Psychology, 
1925, Jan.—Mar., pp. 383-399. A study of methods of rating and a comparison 
of results with objective measures. 

A Child Who Would Not Talk. Margaret Morse Nice. The Pedagogical 
Seminary and Journal of Genetic Psychology, 1925, Mar., pp. 105-142. A case 
study of a child who was very slow in learning to talk. 
















NEW PUBLICATIONS IN EDUCATIONAL
PSYCHOLOGY AND RELATED FIELDS OF
EDUCATION











CONDUCTED BY JOHN HOCKETT 
The Lincoln School of Teachers College 


LECTURES IN EDUCATIONAL PSYCHOLOGY FOR STUDENTS AND LAYMEN 


Instinct, Intelligence and Character, by Godfrey H. Thomson. New 
York: Longmans, Green & Co., 1925. Pp. 281. 


This book is an unusual achievement. Heretofore we have been 
thinking too much of contributions in psychology as discovery and 
verification of truths; little attention has been paid to adapting technical 
psychology to the understanding and use of laymen and of beginning 
students. Often among scientists ability to teach has been contrasted 
with scholarly achievement and research. The difficulty of referring 
laymen to sound, but readable, works in psychology is evidence of this 
regrettable fact. The two points of view have been again reconciled. 
In most interesting narration and description the author has presented 
the essence of educational psychology. The subject-matter is sound; 
the style is more gripping than many works of fiction. The volume 
stands as a laudable contribution from this most able English scholar. 

The book is not a text, and is not adapted for such use. Rather, 
it is a series of lectures which were delivered to members of graduate 
classes in educational psychology, paralleling and integrating their 
preparation and understanding of text and supplementary assign- 
ments. Practically all the usual topics in educational psychology 
have been treated in a simple, concrete, and illuminating style. The 
general viewpoint of the author is shown in a final paragraph: 

“These, then appear to be the main ideas urged: The continuity 
of action and thought, the importance of originality, and the danger 
of authority in intellectual things, the need of subjecting theory to 
trial, and of seeing general methods in several subjects, the importance 
of re-directing instincts and avoiding mere repression, the value of 
play methods, and methods seizing upon the instinct of the moment 
with younger children. Throughout, education should look more
and more ahead, both in cultivating intelligence and in creating
character. For the latter can be created, while intelligence, it would seem,
is much more a matter of heredity. In it, individual differences seem 
more inborn than in character; and the task of the school is rather that 
of discovering than of making talent, the task of finding the shape of 
pegs, not whittling them to fit the square or the round holes. Intel- 
lectual guidance and character training.” 

Special commendation must be made of the effective inter-relation 
of instinct, intelligence and character, functions which heretofore have 
been opposed one against the other in the popular mind. The rise 
of intelligence through variation of instinctive responses and character 
training as behavior training controlled by those same methods which 
are used in controlling behavior of the cognitive and manipulative 
kind are lucidly described. Thus the integration of mental experi- 
ences is splendidly emphasized. 

The reviewer regards the book as one of the most valuable ones in 
its own field which have appeared recently. It will be inspirational 
to teachers and will be widely used for supplementary reading by 
students; it will be helpful to the layman who has little or no technical 
knowledge of psychology and also to the more advanced student. The 
latter can lay aside for the moment the dryer details and perhaps 
necessary profundity of textbooks for this no less sound and far more 
clarifying description and application of psychological truths. 

EDWIN MAURICE BAILOR.





MENTAL MEASUREMENTS OF YOUNG CHILDREN 


The Mental Growth of the Pre-school Child, by Arnold Gesell. New 
York: The Macmillan Co., 1925. Pp. 447. 


Clinical psychologists who have been looking for tests for the very 
young child will find this book extremely helpful. It represents the 
clinical experience of the author with pre-school children over a period 
of some six years. It discusses the principles as well as the technique 
of developmental diagnosis. Parents should also find the book useful. 

The author points out that units of time must be considered as of 
varying significance at the various age levels. A month’s variation 
in infancy may mean a year’s variation in later childhood. Conse- 
quently, his data are divided according to nine age intervals, three to 
four months, six months, nine months, one year, one and one-half
years, two years, three years, four years, and five years to six years.
Fifty children were tested at each of these levels.

Considerable space is given to the description of the large number 
of tests developed by the author. A helpful device is the scheme of 
letter rating according to the frequency of occurrence. A particular 
test, for instance, may be passed by from 65 per cent to 84 per cent of 
four-months-old infants and hence gets a rating of 4B. The same test 
may be passed by 85 per cent to 100 per cent of the six-months-old 
infants and hence gets a 6C rating. 
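
The rating device described above amounts to a lookup from the percentage of children passing a test to a letter, prefixed by the age level. The Python sketch below encodes only the two bands quoted in this review (B for 65 to 84 per cent, C for 85 to 100 per cent); the remaining letters of the author's scheme are not given here and are deliberately left out, and the function and variable names are hypothetical.

```python
from typing import Optional

# Only the two bands quoted in the review are encoded.
# ASSUMPTION: any percentage outside these bands is left unrated (None),
# because the review does not give the rest of the author's scheme.
RATING_BANDS = [
    (65.0, 84.0, "B"),
    (85.0, 100.0, "C"),
]


def letter_rating(age_label: str, percent_passing: float) -> Optional[str]:
    """Return a rating such as '4B', or None if no quoted band applies."""
    for low, high, letter in RATING_BANDS:
        if low <= percent_passing <= high:
            return f"{age_label}{letter}"
    return None


if __name__ == "__main__":
    print(letter_rating("4", 70.0))  # four-months level, 70 per cent passing
    print(letter_rating("6", 90.0))  # six-months level, 90 per cent passing
```

The same test thus carries a different rating at each age level, which is what allows a single item to serve several levels of the schedule.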

Stress is laid throughout the book on the comparative method, and 
a section is devoted to a discussion of simultaneous testing of children 
of different ages. 

Sections on developmental correspondence in twins, on the foetus, 
the neonate, and action photographs are included. An interesting 
chart showing the rate of growth of one normal and two subnormal 
children on successive examinations, all below two years of age, indi- 
cates the possibility of prediction of later intellectual status, even 
when the subject is under one year of age. The author discusses the 
possible pre-school superiority of several men of genius, but does not 
present the results of actual investigations bearing on the rate of mental 
growth in children. BETH WELLMAN.





EXPERIMENTAL STUDIES OF YOUNG CHILDREN 


Mental Growth of Children in Relation to the Rate of Growth in Bodily 
Development, by Buford J. Johnson. New York: E. P. Dutton 
& Co., 1925. Pp. 157. 


There are two outstanding and commendable features of this
book, the emphasis on repeated mental and physical measurements of 
the same child for determining the rate of growth, and the attempt to 
treat the child as a unit by synthesizing these two aspects of growth. 
The book is a report of careful and extensive investigations on children 
from one year to 13 years of age, with repeated measurements on the 
same children. 

For physical development the author has used as criteria height, 
weight, blood pressure, pulse rate and the reflexes. Stanford-Binet
Tests, Pintner-Paterson Performance Tests, the Rossolimo Tests and
several tests of specific mental abilities were used to determine mental
development. In addition to these, several psychophysical tests for
muscular control were included.

The author concludes that the intelligence quotient does not remain 
constant for children tested when very young, and that either mental 
growth proceeds at a more rapid rate during the first four or five 
years of life than thereafter, or the measuring methods are inadequate. 

By partial correlation technique for eliminating the influence of 
age, she finds that mental age does not show a significant relationship 
to the weight-height index, to blood pressure, or to pulse rate. Atten- 
tion should be called, however, to the fact that the children studied 
are a superior group mentally, that is, they are above the age 
norms, and that they are also above the age norms for height 
and weight, particularly in the case of girls. Correlation procedure 
with the weight-height index may cover up some real relationships, 
first, because of the doubtful validity of correlating ratios, second, 
because of the varying relationship to chronological age of the two 
factors in the weight-height index, and third, because the two sexes 
should be treated separately in the correlations. BETH WELLMAN. 
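
The partial correlation technique mentioned in this review can be illustrated with the standard first-order formula, r_xy.z = (r_xy − r_xz·r_yz) / √((1 − r_xz²)(1 − r_yz²)). The Python sketch below applies it to purely hypothetical coefficients; none of the numbers is taken from the book under review.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denominator == 0.0:
        raise ValueError("partial correlation is undefined when |r| = 1 with z")
    return (r_xy - r_xz * r_yz) / denominator


if __name__ == "__main__":
    # HYPOTHETICAL values, for illustration only:
    # x = mental age, y = weight-height index, z = chronological age.
    r = partial_correlation(r_xy=0.60, r_xz=0.80, r_yz=0.70)
    print(f"correlation with age held constant: {r:.3f}")
```

With these illustrative values a raw correlation of .60 shrinks to about .09 once age is held constant, which shows how an apparent relation between mental and physical measures can be largely a product of common growth with age.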





EDUCATIONAL EXPERIMENTATION IN NEW YORK


Contributions to Education, Vol. I, edited by J. Carleton Bell. 
World Book Co., 1924. Pp. IX + 359. 


This volume consists of 31 separate contributions by members of 
the New York Society for the Experimental Study of Education. A 
perusal of the lists of contributors reveals the names of many educa- 
tional leaders in the public schools and in the various private institu- 
tions of New York City. The Society itself, organized in 1918, has 
a two-fold purpose: To encourage experimental studies in all fields of 
education, and to serve as a clearing house for reports and discussions 
of the many scattered experiments being carried on in the schools 
of the city. 

The present volume contains reports of the progress or completion
of several investigations by groups or individual members. Important
among these are “A Comparative Study of Group Intelligence
Tests Applicable to Children of Kindergarten Age,” “Placement
Examinations in French,” “Tests in Physical Education,” and “Topics
in High School Mathematics.”










There are also articles which deal with problems facing educational 
workers today. While theoretical or philosophical considerations are 
set forth in many of these discussions, the emphasis is upon the experi- 
mental attack. Constructive suggestions are given for future research 
and experimentation in numerous educational fields. 

The range of topics included in the chapter headings is broad; 
they are in fact more varied than can be even indicated in this review. 
The chapters are simply, clearly, and briefly written, each with a 
distinct message from the experience of the contributor. All who are 
interested in the experimental study of education will profit by the 
reports and suggestions contained in this book. All who wish to view, 
in cross section, present day educational thought and effort of varied 
aspect will enjoy reading Contributions to Education. It is the opinion 
of the reviewer that Vol. I justifies its title. J. H. 





DESCRIPTIONS OF LEARNING PROCESSES 


Skill in Work and Play, by T. H. Pear. New York: E. P. Dutton & 
Co., 1924. Pp. 107. 


This interesting little book describing certain phases of the learning 
processes is welcomed from the author now personally known to many 
through his recent visit to this continent. The aim of the book as 
stated is “to focus clearly and to describe simply the most important
problems in the acquisition of muscular or bodily skill.” Its treatment 
appears in six chapters entitled: The Description of Muscular Experi- 
ences; Acquiring Skill; The Experimental Investigation of Learning; 
Training in Muscular Performances; The Relation between Training 
and Education; and Training of the Intermediate Ranks in Industry. 
The first two topics present the special difficulties of introspective 
description of muscular activity and the similarities between acquisi- 
tion of skill and of knowledge. Chapter III briefly summarizes the 
usual discussions of learning curves and adds interesting comments 
showing the motion studies of the Gilbreths and Frederick A. Taylor 
to be inadequate for either industry or psychology. Criticisms of 
present methods of training muscular performances are sound and 
should be helpful to laymen, but no principle new to psychologists is 
presented. The spread of training or “educational value” is explained
under the usual headings of methods, attitudes, and ideals. Effective 
training in industry requires, according to the author, acquisition of a
high degree of technical skill and clear concepts of methods whereby
maximum spread of training can be assured.

The style is simple and attractive and although not ambitious the 
work deserves much commendation. It reflects the special attention 
which has been given to industrial psychology in England and may 
stimulate students in this country to work in this very promising, but 
little developed, field of psychology. EDWIN MAURICE BAILOR.





A COMPARATIVE STUDY OF TYPES OF EXAMINATIONS


New Type Examinations in the High School, by Sterling G. Brinkley. 
New York: Teachers College, Columbia University, Contribu- 
tions, No. 161, 1924. Pp. VI + 121. 


This is the report of a study of the comparative values and limita- 
tions of several of the new, relatively objective types of tests, com- 
pared with one another and with the older essay examination. The 
experimental work consisted of two parts. The first experiment 
involved the preparation by the author of 10 weekly tests, which were 
given to five parallel classes in American history in the George Wash- 
ington High School of New York City. The purpose was to compare, 
in several respects, the six types of tests used. These were: True-
false, multiple choice, completion, word or phrase answer, arrange-
ment of a series of items in chronological or other order, and essay.
The second phase of the experimental work was aimed at the deter- 
mination of the effect of a limited amount of training upon the ability 
of high school teachers to construct and use effectively tests of the 
newer types in various school subjects. Additional evidence upon 
the relative values of the various types of tests was secured. 

The general conclusion of the study was, very briefly, that when 
tests requiring equal amounts of the pupils’ time were compared, the 
various types yielded results of essentially equal value. This conclu- 
sion was stated as valid whether the objective was to test general 
achievement, ability to think, or information. The true-false test 
was found to be much more comprehensive than an essay examination 
requiring equal time, and consequently of greater value in diagnosis. 
Pupils preferred tests composed of both old and new type questions. 
As would be expected, teachers preferred to construct essay tests, 
but found the objective types more satisfactory to score. 








The subject under consideration is one of wide interest. The 
present investigation makes a valuable contribution to the discussion 
of the subject, but, due to the necessarily limited number of tests, 
pupils, school subjects, and teachers involved, leaves much to be hoped 
for in the way of future work. J. H. 





PROGRESS IN THE MEASUREMENT OF EMOTION 


The Measurement of Emotional Reactions: Researches on the Psycho- 
galvanic Reflex, by David Wechsler. New York: Archives of 
Psychology, Pp. 181. 


In the measurement of emotion psychology has, to date, made 
little progress. The experimental investigation of emotions is diffi- 
cult. Largely for this reason the researches in this field have been 
few in number. And with the exception of the more general and 
inclusive work of W. Whately Smith reported in his The Measurement 
of Emotion the investigations have been limited in scope. Additional 
researches are needed. The present contribution is, therefore, a 
timely one. 

The findings are presented in an inductive manner. The first 
few chapters furnish the reader with information required for an 
understanding of the results reported and conclusions drawn in the 
later chapters. In them are reviewed: (1) The various methods used 
in the investigation of changes described as physiological expressions 
of the emotions, such as variation of respiration, blood pressure, capil- 
lary pulse as well as the psychogalvanic reflex, (2) the known facts of 
the electrical behavior of the skin, (3) the ways of measuring the 
galvanic response. 

Of greater interest to students of psychology and education are 
the data reported in the later chapters of the book—the physiological 
factors which influence the galvanic response discussed in Chap. V, 
the psychological correlations in Chaps. VII and VIII, and the “Practical
and Clinical Applications” described in Chap. IX.

Some of the more interesting conclusions based on the findings 
reported in these researches are: 

1. The galvanic response is an index of the occurrence of an 
affective reaction in response to an exciting stimulus. The fact that 
the galvanic response can be elicited under sleep the author takes as 
evidence that the response does not measure a perceptive reaction. 





2. Introspective ratings of words used as stimuli for emotional 
reactions correlate .59 and .67 with results determined by galvano- 
metric deflections. The discrepancies are ascribed by the writer to 
the subconscious affective value of certain words to certain subjects. 
This explanation he bases on the fact that, in his investigations, the 
words which gave larger galvanic responses called forth, upon further 
examination, associations of the subjective type; whereas those which 
gave smaller galvanic responses called forth associations of the objec- 
tive type. 

3. Individuals tend to be specifically emotional. This conclusion 
is based on the intercorrelations which the author discovered between 
the various types of stimuli used to evoke emotional reactions. 

4. The practical applications of the psychogalvanic reflex are, 
according to the author: (a) To differentiate hysterical anesthesias 
and analgesias from similar conditions due to organic lesions, (b) to 
differentiate between stuporous manic depressive insanity and cata- 
tonic dementia precox, (c) in study of certain types of exophthalmic 
goitre, (d) in study of effects of drugs on the nervous system. 

These researches seem to show that the practical investigation of 
the psychogalvanic reflex is possible in a number of fields. The 
technique used in these researches did not fulfil the ideal condition 
set by the author. This he met only in the end of his researches. 
But that is hardly a criticism; for what researches could not be im- 
proved if the knowledge gained from experimentation had been avail- 
able at the start? It is ever so. The result is an improvement in 
technique and an increase in our knowledge of the physiological and 
psychological factors which influence the galvanic responses of an 
individual. And that is a useful contribution. H. MELTZER.





A STUDY OF DIFFICULTIES OF GEOMETRY


Solving Geometric Originals, by Frank Charles Touton. New York: 
Teachers College, Columbia University, 1924. Pp. 114. 


This study, made in 1919 and published in 1924, is an attempt to 
deduce from the solutions of exercises in geometry those principles 
which should guide the teacher when directing “the thinking of pupils 
in their first attempts at the solution of new and involved geometric 
situations.” The basis of this study was a critical analysis of about
2800 papers in Plane Geometry of the New York Regents Examination,
June, 1918. 

The introduction consists of a list of the facts and principles in- 
volved in the proof of each of the eight exercises studied, but treated 
as separate ideas or concepts with no hint as to their interrelations. 

In order to examine the relative difficulties of the exercises, the 
author assumes “that other factors so counteract one another that
their combined influence would not greatly disturb the results obtained
from the use of the mean scores and the percentages of the pupils
selecting the exercise.” In a recent experiment the reviewer found
that other factors, including the interrelations mentioned above, are 
the essential parts of the proof wherein lie the various degrees of diffi- 
culty. Professor Touton also states that teachers by virtue of their 
own analyses of exercises should weigh these according to their rela- 
tive difficulty. However, teachers do not seem to agree on even the 
relative order of difficulty of exercises involving demonstration of a 
given statement; this was shown clearly in a study reported in the 
February, 1925 issue of the Mathematics Teacher, under the title, 
“Student Difficulties in Exercises in Geometry.” 

The sex differences in mean and median scores were found to be 
very slight, favoring the attainment of boys. In fact, the differences 
are so small that we wonder if their standard errors would not reveal 
them to be actually, or nearly, chance differences. The author him- 
self cautions readers in interpreting these data in that the papers 
examined included practically none of less than passing grade. Fur- 
thermore, when his data concerning the variability of the sexes con- 
flicted with the findings of other investigators, the author states the 
comparative data with considerable care, as well as his interpretation 
of the causes underlying these differences. 

Chapter IV describes the “Steps in the Thought Processes Involved 
in Solving Originals.” The importance of the knowledge of basic
facts and principles in relation to their probable uses in the proofs of
exercises, of the potency of elements, of testing suggestions as they
arise to select those pertinent for this exercise, and the final step of
logical arrangement of the steps in the proof, are very carefully
discussed. However, the “search for direct outcomes” of the given
and required elements places the emphasis upon inductive, rather than
the deductive reasoning of geometry. This inductive approach and the
tendency to regard the given rather than the required elements as 
the guiding factors in the solution of an exercise result in a student
technique which leads into various attempts each with apparently
equal probability of success. On the contrary, a student technique 
based upon the deductive approach with the required relation as the 
guiding factor lessens the process of trial and error, because it is the 
means of indicating which one of the possible attempts has the greater 
probability of success. This process facilitates the formation of habits 
of reasoning more directly from the known to the unknown relations. 

Chapter V contains many suggestions of practical value for the 
classroom teacher. But, in the light of more recent investigations, 
the suggestions concerning the type of classroom participation of 
superior students as well as the extent of classroom drill of propositions 
are not advisable. 

Thus, “Solving Geometric Originals” is a very careful and detailed
study of certain aspects of the results of a year of study of Plane
Geometry by a selected group of students. It is not an investigation
of the actual learning process of a more typical group or of individual
students, as “solving” might imply. WINONA M. PERRY.





PERSONALITY ADJUSTMENTS 


Three Problem Children. Narratives from the Case Records of a Child 
Guidance Clinic. Publication No. 2. Joint Committee on 
Methods of Preventing Delinquency, 50 East 42 St., New York
City. Pp. 146.





Even a cursory reading of these case records of Mildred, Sidney 
and Kenneth shows that types of behavior problems are exceedingly 
diverse and points the need for studying each child as an individual. 
Each of these three children represents a special combination of 
physical, mental, and social traits and his own distinct personality 
problem. In each case behavior is shown to be not a problem in 
itself but an indication of something that is wrong in the child’s life. 
The direct responsibility of the home life and the school situation is 
plainly shown and the influence of the attitude of the teacher and 
parents is made clearly evident. 

In these cases scientific personality studies were made by an organ-
ized clinical group. The basis of the studies was physical, psycho-
logical, psychiatric and social examinations made by skilled workers.





Many teachers, parents, and others who deal with children do not 
have an opportunity to secure such skilled service. To such people 
this little book should offer many practical suggestions for wise han- 
dling of the children under their care. It will show them in actual 
situations the mistakes which teachers and parents so commonly 
make and the disastrous results of such treatment. It points out the 
far reaching effects of apparently trivial circumstances and the neces- 
sity for searching deeply for the complex causes of conduct. Chance 
remarks carefully noted and studied often throw the most light on 
the underlying causes of the whole trouble. 

Behavior difficulties often mark the beginning of a disordered 
personality which hinders normal social development and makes for 
life-long unhappiness. Carefully planned study and treatment as 
carried on in the cases discussed bring normal adjustment and greater 
social usefulness. CECILE COLLOTON.





ANOTHER BOOK ON PERSONALITY PROBLEMS


The Psychology of the Unadjusted School Child, by John J. B. Morgan. 
The Macmillan Co., New York, 1924. Pp. XI + 292. 


This book is intended for the guidance of parents, teachers, and 
others actively engaged in developing the personalities of children. 
The author takes the position that many types of inadequate adjust- 
ments are preventable and he attempts to state the principles of train- 
ing which will insure normal adjustments. 

The problem is attacked in the following way. Dr. Morgan 
describes in detail the various unsatisfactory ways of meeting the 
stresses imposed by reality. He then explains the influences through 
which each particular inadequate adjustment arises, and how it is 
corrected or avoided. Special emphasis is placed on “Retreat from
Reality,” “Compromise with Reality,” and “Distortion.” Other
portions of the book deal more directly with the problem of proper 
training. 

The author’s psychological theories are of the psychoanalytic 
type. He accepts the concept of a subconscious mind. He speaks 
of “surrendering the ego,” of “projecting the self-love” on external
objects, and of “triumphing over mental conflicts.”


In general, it may be said that the book is extremely suggestive. 
However, its practical value is somewhat impaired by the inadequacy 
of the illustrative material. Very few typical school child disturbances 
are described and in almost no instance is the discussion of cases 
sufficiently complete. On the other hand, it is almost the only avail- 
able scientific treatment of a vital psychological problem. It is 
certainly worthy of careful study by such teachers as have some 
knowledge of the field. ENGLISH BAGBY.

Yale University. 





A STUDY OF MEMORY-SPAN ABILITY


Some Memory Span Problems: An Analytical Study at the College- 
adult Level, by Robert A. Brotemarkle. Ph. D. Thesis, University 
of Pennsylvania, Philadelphia, 1924. Pp. 29. 


In this short study are reported the results of nine experiments 
with memory span tests on some students of the first-year course in 
psychology at the University of Pennsylvania. Digits, three-letter 
words, syllables and ideas were the types of memory span tests used. 
Six of the nine were group experiments; three were individual. All
three individual experiments were for digits; one was an introspective 
study. 

Memory span, the writer reminds us, is not memory but is an 
ability itself. And “the memory span test,” he concludes, “is diag-
nostic of the complexity of mental organization,” but, “the memory
span score is not sufficient in itself; it requires a complete analysis of
the complexity of the mental processes involved, and assurance of the
directness of the response.” Because of the “varying diagnostic
values of the different types of memory span,” it “has a certain restric-
tion placed upon its use by the intellectual levels of the individual
being tested.” 

The write-up of this study is brief enough but hardly clear enough. 
Because of what seems to be an undue effort to economize space, 
there is very little interpretive reading matter contained in this study. 
Hence, one finds many tables to examine, but little reading matter 
to interpret the tables. H. MELTZER.










PERSONALLY APPROVED ENGLISH INSTRUCTION 





Teaching English in High Schools, by Russell A. Sharp. Houghton
Mifflin Co., Boston: 1924. Pp. XI + 162.


This recent addition to the Riverside Educational Monographs is
described by its author as a “sheaf of gleanings from a dozen years
at the teacher’s desk.” It contains chapters on the preparation of the
teacher of English, the objectives of English, the course of study,
problems in teaching the classics and composition, reading and spelling,
fads and reforms, segregation, and extra-curricular activities. For the
most part the ideas presented are those to be expected from a teacher
who likes to justify his ways by the use of the word pragmatic, who is
suspicious of college, ex cathedra theory, who does not see how the
public high school can do much for “the occasional literary genius,”
who feels most at home with the classics—the right ones, of course—
who sees no flaw in the theme-writing ritual, and who in general is
prejudiced in favor of the so-called safe and sane practices that
“candidates for the doctorate, test-and-measurement hounds, scale-
makers, and the whole tribe of statisticians” are constantly seeking
to call in question. The purpose of the book is, admittedly, limited;
and it fulfills its purpose. M. H. WILLING.





A TEACHER’S MANUAL IN EDUCATIONAL PSYCHOLOGY


Introduction to Educational Psychology, by Howard Taylor. Balti-
more: Warwick & York, 1925. Pp. 172.


This book is an attempt to present in outline form some of the
essential facts and problems of educational psychology adapted for a
first course in the subject. Sixteen fairly well chosen topics are pre-
sented. The treatment of each consists of: (a) A very commendable
selection of reading references; (b) a skeletal outline of the topic; and
(c) a list of problems. The treatment throughout is sketchy and brief.
Alternate pages are left blank for Teachers Notes leaving less than 80
pages of material presented. In his attempt to maintain conciseness
the author commits inexcusable faults as to say, for example, “There
is no transfer of training,” strict interpretation of which would lead
one to ask for justification of the manual itself. However, aside from
such slips the work is commendable in content as well as in its attempt
to change the presentation of educational psychology from narrative
description to effective application of the science to educational
problems. The purpose is not new, but credit must be given for 
another effortful attempt. The manual will undoubtedly be of help 
to teachers of the subject who are incompletely prepared and who are 
trying to present the subject in problem or syllabus form. 

EDWIN MAURICE BAILOR.





WHY DO CHILDREN LEAVE SCHOOL FOR INDUSTRY?


The Intelligence of Continuation-school Children in Massachusetts,
by L. Thomas Hopkins. Harvard University Press, Cambridge,
1924. Pp. XIV + 132. 


The book is a painstaking report of an investigation undertaken 
to throw light on the factors which cause children to leave school 
for the industrial field. The study involves an analysis of 1200 
children in Massachusetts continuation schools, and 1980 children 
of the same ages in the regular schools. Both groups were selected 
to give as nearly as possible a random sample of all such children in 
Massachusetts. The chief instrument used for measuring the children 
was the Dearborn General Intelligence Test, Series II, which seems 
from the report to have been carefully given and accurately evaluated. 

The conclusions point clearly to the fact that the real reason 
why children leave school for industry is not economic necessity, or 
desire to work, but a lack of the intelligence requisite to the successful 
completion of the work of the upper elementary and lower secondary
schools. The book includes interesting discussions of the upper limit 
of the development of intelligence, sex differences, problems of accel- 
eration and retardation, and other phases of critical educational and 
psychological problems. Not the least interesting of the chapters is 
the one entitled “Suggestions for Improvement.” The style when
freed from the necessity of statistical proofs is readable enough to make 
the book of real value to teachers and education at large. 


E. LEONA VINCENT.













